EVALUATING 
STUDENT 
PROGRESS 


IN THE SECONDARY SCHOOL 






ALFRED SCHWARTZ & STUART C. TIEDEMAN 


UNIVERSITY OF DELAWARE DRAKE UNIVERSITY 


WITH THE ASSISTANCE OF 


DONALD G. WALLACE 


DRAKE UNIVERSITY 




1957 
LONGMANS, GREEN AND CO. 


NEW YORK LONDON TORONTO 


LONGMANS, GREEN AND CO., INC. 
55 FIFTH AVENUE, NEW YORK 3 


LONGMANS, GREEN AND CO., LTD. 
6 & 7 CLIFFORD STREET, LONDON W 1 


LONGMANS, GREEN AND CO. 
20 CRANFIELD ROAD, TORONTO 16 




EVALUATING STUDENT PROGRESS 
IN THE SECONDARY SCHOOL 


COPYRIGHT © 1957
BY LONGMANS, GREEN AND CO., INC.

ALL RIGHTS RESERVED, INCLUDING THE RIGHT TO REPRODUCE
THIS BOOK, OR ANY PORTION THEREOF, IN ANY FORM


PUBLISHED SIMULTANEOUSLY IN THE DOMINION OF CANADA BY 
LONGMANS, GREEN AND CO., TORONTO 


FIRST EDITION 


LIBRARY OF CONGRESS CATALOG CARD NUMBER 57-8357 


Printed in the United States of America 


Preface 


CLASSROOM REALITIES compel teachers both to measure and to evalu- 
ate student behaviors. Report cards, standardized tests, informal 
tests, and guidance conferences are symbolic of the current uses of 
the tools and techniques of measurement and evaluation in many 
classrooms. The need to employ all applicable techniques of appraisal 
in the schools has become a practical matter faced daily by teachers. 
It is the authors’ belief that a textbook for teachers or prospective 
teachers must be directed at the actual problems found in the class- 
room. Teachers must make effective and efficient use of the tools and 
techniques of measurement although they may not be statisticians or 
specialists in test construction. The authors believe that this book 
will enable teachers to secure the necessary knowledges and under- 
standings needed for securing measurements and making evaluations. 
This book is based upon a guidance-instruction rationale which 
holds that learning is most effective when teachers guide students 
through well-organized teaching-learning situations toward well- 
defined goals. This rationale presupposes that measurement and 
evaluation are logical divisions of the total teaching-learning process. 
Every effort was made by the authors to develop a realistic text- 
book for teachers based upon sound principles of measurement, in- 
struction, guidance, and curriculum development. A book based upon 
sound theory should help produce sound practices. 
Grateful acknowledgment is made to Donald G. Wallace, Dean,
Graduate Division, Drake University, for assisting in the development
of the outline upon which this book is based, and for contributing
basic portions of Chapters 5, 6, and 8. Only the authors, however,
are responsible for the final draft of the book.

For their courtesy in permitting the quotation of significant text- 
book passages, segments of standardized tests, and other materials 
the authors are indebted to many individuals and publishers. In each 
instance, proper documentary acknowledgment has been given. 

The authors also wish to recognize the assistance of their many 
students and teachers who have developed materials and ideas used 
throughout the book. To Harlan L. Hagman, Dean, College of Edu- 
cation, Drake University, the authors offer their thanks for constant 
encouragement and friendly persuasion. To Delle Schwartz, Regina 
Tiedeman, and Genevieve Wallace they extend sincere appreciation 
for being most patient. 

ALFRED SCHWARTZ 
STUART C. TIEDEMAN




Contents

1. Evaluation in Education
2. The When, What, Who, Where, and How of Evaluation
3. Identifying Educational Outcomes
4. Determination of Classroom Objectives
5. A Measurement Rationale
6. General Suggestions for Test Construction
7. Constructing and Using Objective Tests
8. Construction and Use of Essay and Short-Answer Tests
9. Checklists, Rating Scales, Inventories, and Questionnaires
10. Use of Observation, Anecdotal Records, and Interviews
11. Using Sociometrics, Sociodrama, Autobiography, and
Other Informal Techniques
12. The Case Study and the Case Conference
13. Standardized Tests—Some General Considerations
14. Standardized Tests—Application
15. Interpretation of Test Scores
16. Diagnosis from the Results of Measurement
17. Guiding Student Progress
18. Reporting Pupil Progress
19. Evaluation and the Teaching-Learning Situation
Selected References
Index


List of Figures 


1. Evolution of educational objectives
2. Continuum development of behaviors
3. Relationship of a specific objective to learning experiences
and resultant behaviors
4. Two-dimensional grid chart
5. Test item pool card
6. A typical checklist
7. Self-rating checklist
8. Summary of ratings in cooperation
9. Major areas and selected items from the SRA Youth
Inventory, Form A
10. Sample questionnaire
11. Sample anecdotal record form
12. Multiple anecdotal record form
13. Multiple anecdotal record form
14. Tabulation form for sociometric data
15. Sociogram of a junior high group
16. Model personal data blank
17. Profile chart of Susan’s test record
18. Summary profile sheet, California Test of Mental
Maturity
19. Meaning of stanine scores
20. Three different distributions of similar test scores
21. Illustrations of fitted normal curves
22. Relationship of various types of scores to the normal
distribution curve
23. Scattergram illustrations of the relationship between two
variables
24. Forces operating as determinants of student behavior
25. Sample profile, Cooperative School and College Ability
Test
26. Sample profile, California Achievement Test
27. Sample profile, Iowa Tests of Educational Development
28. Relationship of English marks and scores received on the
Multiple Aptitude Test 3, Language Usage
29. Sample expectancy chart
30. Conventional report card
31. Section of report form used at the University of Chicago
Laboratory School
32. Art section of report form used at the University of
Chicago Laboratory School
33. Social studies section of the report form used at the
University of Chicago Laboratory School
34. Music section of the report form used at the University
of Chicago Laboratory School
35. English section of the report form used at the University
of Chicago Laboratory School
36. Section of report form used by Rich Township High
School, Park Forest, Ill.
37. Section adapted from pages 2 and 3 of report form of
Sheboygan Falls, Wis.
38. Sample self-reporting form
39. Student-teacher reporting form


List of Charts 


1. Relationship of Over-all Purpose of Education to All-School
Objectives, to Specific Classroom Objectives
2. Teaching Social Dancing in Physical Education
3. Educational Objectives in Art
4. Educational Objectives in Core Classes
5. Biology: Reproduction of the Flower
6. Profile Index


List of Tables 


1. Final test scores in spelling for the sixth grade at a sample
school, 1957
2. MAT differential intelligence norms—grades 10, 11, 12—males
3. Predictive efficiency of coefficients of correlation of varying
magnitude




CHAPTER 1


Evaluation in Education 


EVALUATION IS THE PROCESS of making judgments and coming to 
decisions about the value of an experience. The process consists of 
two elements: (1), a goal or objective for the experience to be evalu- 
ated must be set, and (2), some measure of amount, status or prog- 
ress must be made. An evaluation of the experience then involves 
a carefully considered judgment as to the adequacy or effectiveness 
of the experience as measured in the light of the objectives set for it. 

In education, evaluation is the process of judging the effectiveness 
or worth of an educational experience as measured against instruc- 
tional objectives. Evaluation makes use of measurement, but is not 
limited to it, nor synonymous with it. Measurement never gives more 
than an answer to the question, “How much?” Evaluation, on the 
other hand, seeks an answer to the question, “Of what value is this 
measure of amount, status, or progress when compared with the in- 
structional objectives?” 

That the same measure of amount may mean quite different things 
when seen in relation to different values or objectives may be shown 
in the following example from a classroom. On an American history 
examination, Joe, Mary, and Fred each answered correctly seventy- 
five of the one hundred items in the test. Measurement provides a 
score of 75 for each student. Should one of the major purposes of in- 
struction be “to insure that all students achieve at least a minimum 
level with respect to subject matter competency,” and the teacher 
has defined 70 as this minimum, the evaluation of the examination
grades for these three students leads to the conclusion that all three
are making satisfactory progress.

These same test grades can be appraised against a different objective,
one that states that the purpose of instruction is “to help each
student learn and develop to the best of his ability.” This objective
requires the introduction of at least a second set of measurements, 
scores from a valid and reliable scholastic aptitude test. Assume 
that, from this second test, it is found that Joe has an I.Q. of 80, 
Mary an I.Q. of 100, and Fred an I.Q. of 140. With this information 
and in the light of the second objective, it could hardly be concluded 
that each of the three students was making satisfactory progress. 
The basic measurement of 75 in American history for each student
remains the same, but when evaluated against different criteria, 
different conclusions materialize. In this example, it might be con- 
cluded that Mary is doing as well as can be expected, that Joe is 
achieving at a higher rate than might be expected (and that he
should be encouraged and helped to continue at this rate), and that 
Fred is doing definitely poorer work than he is capable of doing (and 
that he should be encouraged and helped to perform at a rate more 
in accord with his ability). 
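
The contrast between the two objectives can be made concrete with a small sketch. It is only an illustration under stated assumptions: the scores of 75, the minimum of 70, and the I.Q. values come from the example above, but the `expected_score` rule and the `tolerance` of 5 points are hypothetical stand-ins for the teacher's judgment of what each student "might be expected" to achieve. The text prescribes no such formula.

```python
from dataclasses import dataclass

MINIMUM_SCORE = 70  # the minimum level the teacher defined in the example


@dataclass
class Student:
    name: str
    history_score: int  # items answered correctly out of 100
    iq: int             # score from the scholastic aptitude test


def meets_minimum(student: Student, minimum: int = MINIMUM_SCORE) -> bool:
    """Objective 1: every student reaches at least the minimum level."""
    return student.history_score >= minimum


def expected_score(iq: int) -> float:
    """HYPOTHETICAL rule (not from the text): an I.Q. of 100 is assumed to
    correspond to a score of 75, and each I.Q. point shifts the expected
    score by half a point."""
    return 75 + 0.5 * (iq - 100)


def relative_to_ability(student: Student, tolerance: float = 5) -> str:
    """Objective 2: each student works to the best of his ability.
    The tolerance band is also an assumption made for illustration."""
    gap = student.history_score - expected_score(student.iq)
    if gap > tolerance:
        return "above expectation"
    if gap < -tolerance:
        return "below expectation"
    return "as expected"


students = [
    Student("Joe", 75, 80),
    Student("Mary", 75, 100),
    Student("Fred", 75, 140),
]

for s in students:
    # Same measurement (75), two different evaluations.
    print(s.name, s.history_score, meets_minimum(s), relative_to_ability(s))
```

Under the first objective all three students pass; under the second, the same score is judged differently for each of them, which is the distinction between measurement and evaluation drawn above.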

Measurement, by whatever means it may be accomplished, be it a 
carefully constructed standardized test in mathematics or a rating 
scale designed to measure home or personal adjustment, is a basic 
part of the evaluation process. But measurement is not enough. 
Measurements must be seen in terms of human values and goals. 
Evaluation, focused upon philosophically and psychologically sound 
objectives, and based upon the best measurements that can be se- 
cured, is a key to securing effectiveness in the total educative process. 


USE OF THE PROCESS OF EVALUATION 
IN NONSCHOOL AREAS 


The use of the techniques of evaluation is not confined to the 
public schools. In daily living there are innumerable opportunities 
to appraise individuals, events, and things. As we meet new people we 
judge them subjectively against a set of personality values which 
we hold. We listen to a political argument on television and judge 
the merits of the speaker against our preconceived notions of what 
is right and what is wrong. We go shopping for a new suit and judge
its quality, cost, and appearance before we make our purchase. There
is very little reason to doubt that much of personal evaluation is
based upon whim, prejudice, and fashion, and not upon objective data
appraised in the light of carefully developed standards. Evaluation 
based upon whim, prejudice, and the fashions of the moment has no 
place in education or in the many other fields where the techniques 
of evaluation are employed. 

The importance of the role which evaluation plays in nonschool 
areas becomes apparent when the use of the techniques of measure- 
ment by the armed forces, civil service, business and industry, pro- 
fessional groups, and certifying and licensing agencies is reviewed. 
Perhaps the most extensive use of evaluation instruments has been 
made by the armed forces in their efforts to utilize the nation’s man- 
power effectively. During World War II the Army used the Army 
General Classification Test (AGCT) to screen over ten million in- 
ductees and enlistees. The AGCT was used to assist classification 
personnel in assigning inductees to particular specialties within the 
armed forces. It was also general practice to establish a minimum 
AGCT score that had to be attained by a candidate for Officers’ 
Candidate School. 

There were numerous other evaluation techniques used by the 
armed forces to assist in the classification process. Personnel records 
of civilian and military experiences were carefully kept so that the 
full potential of the individual could be appraised and used. Special 
tests were also used to determine whether or not a member of the 
armed forces could qualify for various military schools. One such 
test consisted of a paired series of dot-dash sounds that the individual 
had to identify as similar or dissimilar. Again, a minimum score was 
established and the individual qualified for training in communica- 
tions if he attained at least the minimum. 

Perhaps one of the most important evaluative techniques used was 
the interview. Each inductee was interviewed at a classification cen- 
ter and the results of the interview often determined his assignment. 
In recent years a new test has been developed to assist all branches 
of the armed forces in the screening of recruits. The Armed Forces 
Qualification Test (AFQT), which includes vocabulary, arithmetic 
reasoning, and spatial relations items, is being used just as the AGCT 
was used by the Army during World War II. 




The use of these techniques by the armed forces does not provide 
an absolute guarantee that once inducted into military life the teacher 
of Latin won’t become a cook or the civilian cook won’t become a 
language instructor. To the credit of the armed forces, however, faced
with the necessity of screening and classifying millions of men
and women, the use of techniques of evaluation resulted (and
is resulting) in fewer square pegs being placed in round holes
than would have occurred if systematic evaluation had not taken
place.

Civil service, another important nonschool area, developed in the 
United States as an antidote to a political “spoils system” and in an 
effort to secure competent government employees. Civil service ex- 
tends to over 90 per cent of all federal employees and also takes in a 
large percentage of state and local government employees. At the 
heart of the civil service system is an evaluation program to screen 
applicants for positions. While the pattern may vary slightly for 
different governmental units and does vary from position to position, 
the process of evaluation is about as follows: Each government job 
is assigned a position classification and a description of what indi- 
viduals are actually doing in a position is prepared. Job specialists 
then determine the qualifications needed by persons doing the work. 
When necessary, examinations are set to fill vacancies in the civil 
service register. Applicants must complete registration forms which 
are then appraised to determine whether or not the applicant can 
qualify by education and/or experience to take an examination for 
a particular position. Examinations covering general intelligence, spe- 
cific aptitude, or general ability may be given. The applicant must 
achieve a predetermined minimum score to qualify for the next step. 
The applicant then is interviewed by a trained employment counselor 
and the results of the initial application form, the test, and the inter- 
view are combined and the individual is assigned a place on the em- 
ployment register. Evaluation in civil service does not end when the 
individual is assigned his job. Continuation in service, promotions, 
and certain salary raises are directly related to supervisory ratings 
based upon the results of observations, interviews, and other evalu- 
ative techniques. 

Application forms, interviews, supervisory ratings, and general 
ability tests are used as frequently by personnel offices in business and
industry as they are in government. Business and industrial groups
are also making extensive use of aptitude tests in an effort to assign 
personnel properly and to determine potential candidates for addi- 
tional training and advancement. Clerical aptitude, mechanical apti- 
tude, and motor dexterity tests are used as business and industry 
seek to make the best use of the available manpower. 

Among professional groups, teachers are not alone in their use 
of evaluation techniques. Medical personnel start with the basic fact 
that diagnosis must come before prescription. Physicians make ex- 
tensive use of X-ray data, blood tests, tissue analysis, and cardio- 
grams in their efforts to diagnose human ills. Psychologists and 
psychiatrists use many of the same evaluation techniques used in 
the field of education, including projective techniques such as the 
Rorschach (inkblot) and the Thematic Apperception tests. In addi- 
tion to these techniques, psychologists use many personality rating 
scales and inventories. Attorneys and social workers employ varia- 
tions of the case-study technique as they appraise their clients or 
cases. Attorneys draw together all of the pertinent legal decisions 
relative to a specific issue and make their decision on the basis of the 
data they discover. Social workers use the case-study approach to 
secure pertinent information about an individual, his family, the 
environment in which he works and lives, and all other data that will 
help him to assist the client and his family. Engineers make use of 
test borings, land surveys, and research to assist them in making 
decisions about the building of bridges, buildings, and battleships. 
For each of the professional areas the evaluation techniques vary, 
but their use is basic to the progress of the profession. 

Many states have established legal requirements for the certifica- 
tion and licensing of persons in various trades and professions, based 
upon the successful passing of an examination. While the vocations 
covered by these statutes vary from state to state, the range takes 
in the areas from animal husbandry to X-ray technicians. Attorneys 
take specialized bar examinations, doctors must pass medical board 
examinations, accountants must pass highly specialized examinations 
in order to become certified public accountants, and professional and 
nonprofessional candidates in numerous areas are evaluated before
they are permitted to practice their trade or profession. By this 
means the individual trade or profession seeks to protect itself against
fraud and deceit, and the state seeks to protect the citizens within
its jurisdiction. 

Evaluation varies from the highly subjective first impressions 
gained when we meet someone to the refined, scientific objective 
techniques used by scientists in all fields. The success of evaluation
techniques can be measured only within a given field, for the methods 
used by the doctor would not be usable by the engineer, and the 
engineer’s methods would not be useful to the certified public
accountant.


PRECEDENTS OF THE MODERN CONCEPT OF 
EDUCATIONAL EVALUATION 


The roots of the concept of educational evaluation are lost in an- 
tiquity. It is known that the Spartans of ancient Greece appraised 
the ability of their citizens through the use of various tests that were 
designed to measure the individual’s physical prowess; the Chinese
used examinations as early as 225 B.C. to select civil servants; and it
is even recorded in the Bible that a test of the ability to pronounce 
“shibboleth” was used to screen friend from foe. It is reasonably cer- 
tain that some kind of formalized testing has been used in schools 
since their inception. The birth of the modern testing movement, 
however, can be traced to the beginning of the twentieth century, al- 
though some educational leaders in the nineteenth century were 
beginning to criticize the then-existing pattern of examinations. 

From France, England, and Germany came the impetus for the 
early twentieth-century testing movement, although it wasn’t long 
before the movement developed a distinctly American pattern. In 
England Sir Francis Galton developed a wide array of statistical 
techniques, Wilhelm Wundt in Germany developed the first experi- 
mental laboratory in psychology, and Alfred Binet working in 
France developed the first scale for the measurement of intelligence. 
Each of these men and his students stimulated the measurement 
movement not only in his own nation but in the United States as 
well. It is often difficult to indicate precisely the direct contribu- 
tions that individuals make to history, but since the history of the 
testing-measurement-evaluation movement is less than seventy-five 
years old, the contributions of Galton, Wundt, and Binet seem to be 
clearly defined. 




In the United States the extension of public school education 
through the secondary school and the work of Binet, Galton, and 
others appeared to stimulate psychologists and educators to take a 
further look at the problems of testing and measuring in American 
education. Edward L. Thorndike developed standard tests and scales
for measuring achievement, Lewis Terman adapted the Binet
intelligence scale for use in the United States, Rudolf Pintner and
Donald G. Paterson developed nonlanguage intelligence tests, and the
work of Arthur S. Otis contributed to the development of the first
group test of intelligence, the Army Alpha. During the World War I
period, studies reporting on the unreliability of existing tests, plus
the work
of many individuals in colleges and universities, opened the way for 
the testing movement of the twenties. 

During the decade following World War I the testing movement
came into its own. Rapid developments took place in mental,
achievement, and aptitude testing. Groups at universities began to
develop standardized tests that covered a wide range of academic
areas. Commercial concerns were quick to see in this development
many opportunities, and numerous tests were prepared and published.
Diagnostic, as well as survey, tests were developed and marketed. McCall
saw the need for assisting classroom teachers to use objective tests 
and in 1920 he reported on their potential use. During this same 
period, books, articles, and conferences were devoted to the problem 
of testing and measuring. It was becoming fashionable to use 
standardized tests and a good school system could not be without 
its battery of tests and test results. Although many of the instru- 
ments used were poorly constructed and were not particularly useful 
for the classroom teacher, the number of tests bought and used 
continued to grow spectacularly. The new testing movement was not 
accepted as enthusiastically by classroom teachers as by school of- 
ficials, since the teachers could not see its application to teaching. 

The measurement movement in the twenties was largely limited 
to the use of standardized mental ability and achievement tests. 
However, a growing interest in aptitude and personality testing laid 
the groundwork for the decades that were to come. By the time 
World War II erupted, testing specialists had probed into the 
numerous possibilities existing in many areas. Previous to the war, 
however, a study that made a major impact on the field of evaluation
in education was conducted. The Eight-Year Study of students and
curricula in thirty secondary schools provided an opportunity for the 
development of a full-scale program of evaluation. Of great signifi- 
cance to the field of educational evaluation was the fact that in this 
study an attempt was made to put an evaluation rationale into op- 
eration. The teachers, administrators, and research workers engaging 
in the task of evaluating the effectiveness of educational programs 
had to develop their own instruments as they sought to evaluate 
study skills, critical thinking, appreciation, and interests. The Eight- 
Year Study marked a turning point, for it showed that testing 
specialists had too long been concerned with the knowledge aspects
of education and had not placed enough emphasis on the so-called 
intangible outcomes of the educative process. The study was also 
significant in that it provided a laboratory in which many of our 
present-day specialists in evaluation were able to study at firsthand 
the problem of evaluation in education. 

Just as World War I sparked the fire that ignited the testing 
movement, so did World War II create the conditions from which 
has evolved the evaluation movement. Each branch of the armed 
forces recruited its own teams of specialists to study the problem of 
how to use and train most effectively the millions of men and women 
in military service. An opportunity was provided for almost un- 
limited experimentation. Aptitude, mental ability and situational 
tests, personality instruments, interest inventories, interviews, obser- 
vation, and many other techniques were tried. In the period follow- 
ing the war the results of the experimentation were reported in 
books, monographs, and periodicals, which made, and are continuing 
to make, an impact on education, psychology, personnel manage- 
ment, and all other areas in which the process of evaluation is im- 
portant. 

Interest in the measurement of mental ability at the turn of the 
twentieth century has grown in fifty years into an interest in the 
measurement and evaluation of the many different variables that 
make up human behavior. The current trends in the field of measure- 
ment and evaluation are reflected in the following developments: 


1. The importance of the classroom teacher in the process of
evaluation is being recognized and fostered. There is a growing
recognition of the need for informal evaluation as well as the more
formal standardized testing.

2. Since evaluation is a concept that includes the value judgments of
individuals, it has become evident that the teacher and/or test
expert must not only understand the mechanics of test construction
and analysis but must also understand the educational objectives
that the school is striving to develop.

3. The statistical emphasis that has placed so much value on the
reliability of tests is giving way to increasing concern over the
validity of tests. It has become evident to the leaders in the testing
field that an instrument with high reliability has very little value
for the teacher or guidance specialist if it does not measure the
objectives being promulgated. This trend is evident in the efforts
by a group of test experts to develop a taxonomy of educational
objectives that can be used as a guide by test makers.

4. The use of testing to develop achievement standards for a school
system is giving way to a diagnostic approach which places emphasis
on studying the student and attempting to discover the
causes of his behavior. This concept establishes the necessity for
studying the individual's social, emotional, and physical status, as
well as his academic accomplishments and shortcomings.

5. Essay tests, cast into disrepute by the studies in the twenties, are
coming back into their own as one technique for measuring certain
types of educational objectives.

6. Standardized test batteries are giving proportionately more
attention to the broader, general objectives than to the factual
knowledge outcomes of education.

7. It is becoming very clear that test results are not ends in
themselves, but are means for providing better individual guidance,
modifying instruction, and improving the curriculum.

8. To assist teachers in using standardized tests and to improve the
quality of standardized tests, efforts are being made to develop
test standards for test makers and test publishers. It is hoped that
this movement will place the publication of standardized tests on
a sounder, more professional basis.

9. In the area of mental ability testing great strides have been taken
to redefine the concept of intelligence that has prevailed for so
many years. It is becoming increasingly useful to consider
intelligence as a composite of several factors rather than as a single
factor. In line with the reanalysis of the concept of mental ability,
so-called intelligence tests are being classified as measures of
scholastic aptitude to indicate that many of the current mental
ability tests are to a large extent indicators of the “academic”
abilities of individuals.


USES OF EVALUATION PROCEDURES IN EDUCATION 


Students and teachers frequently associate evaluation with tests
and the awarding of grades. While both tests and grades are included
in a concept of evaluation, they have limited uses in the total process.
The concept of evaluation to be developed in this book implies
that an extensive range of techniques must be used for a variety of
sound educational reasons. What follows is a summary of the uses
of evaluation by various individuals in a school system. The summary
is not meant to be an exhaustive analysis, but an introductory
statement of the potential uses of the evaluative process. Evaluation
is used by the teacher— 


1. To discover students’ individual weaknesses and strengths.
2. To secure data that can be used to group students.
3. To study the effectiveness of various teaching methods.
4. To locate the weaknesses and strengths of the instructional
process.
5. To secure data that will facilitate making decisions about
curricula revision.
6. To get information for reporting purposes.
7. To determine the progress students are making toward
instructional objectives.
8. To locate starting points for class work.
9. To motivate student learning.
10. To secure information that can be used for individualizing
instruction.
11. To determine when individual students should be recommended
for specialized remedial programs.
12. To determine when individual students need additional
counseling.
13. To encourage students to study their own weaknesses and
strengths.
14. To gain an understanding of the interests and attitudes of
students.
15. To secure data for the permanent school records.
16. To assist in making promotion decisions.




Evaluation is used by the administrator— 


1. To secure data upon which an appraisal of the entire school or
school system can be based.
2. To study the effectiveness of instruction.
3. To provide the data necessary for an appraisal of the curriculum
offerings.
4. To furnish data for public relations purposes.
5. To assist new teachers in becoming acquainted with students.
6. To secure data upon which to base recommendations for
additional school needs.
7. To secure a gross measure of teaching effectiveness.
8. To determine possible grouping patterns within a school or school
system.
9. To encourage the staff to engage in self-appraisal.
10. To develop a continuous pattern of action research.
11. To facilitate the functioning of the guidance service within the
school or system.


Evaluation is used by counselors, homeroom teachers, and other 
guidance personnel— 


1. To diagnose the individual student’s weaknesses and strengths.
2. To assist the student to overcome difficulties and take advantage
of strengths.
3. To discover information that will be of value to the classroom
teacher in working with the students.
4. To locate the causes of social, emotional, and academic problems.
5. To assist the student in academic and vocational planning.
6. To help parents understand their children.
7. To secure data that will be helpful when making referrals.
8. To help teachers, administrators, and other school personnel un-
derstand the problems faced by the students.


To accomplish the above purposes, teachers, administrators,
counselors, homeroom teachers, boys’ and girls’ advisers, and others
engaged in educational work make use (or need to make use) of a 
variety of evaluation techniques. In the outline that follows, the 
scope of available techniques is presented. 


I. Tests 
A. Achievement 
1. Informal teacher-made
2. Standardized
B. Mental ability 
C. Personality 
D. Aptitude 
E. Interest 
II. Rating scales 
III. Checklists, surveys, inventories, and questionnaires 
IV. Observation 
V. Interviews 
VI. Records and reports 
A. Cumulative folders 
B. Anecdotal records 
C. Diaries and logs 
VII. Sociometry 
VIII. Role-playing 
A. Sociodrama 
B. Psychodrama 
IX. Situational tests 
X. Student projects 
A. Papers 
B. Notebooks 
C. Reports 
D. Autobiographies 
E. Personal data sheets 
XI. Case studies 
XII. Case conferences 


There is no one method or technique of evaluation that is best 
for measuring the wide variety of objectives found in the usual school 
program, and the choice of technique depends almost entirely upon 
the kind of objective to be measured. Measurement is not limited to 
the administration of objective tests, nor is it limited to the use of 
essay examinations, nor to the use of observational techniques. Each 
of these has its place in a total program of evaluation, and each 
can be uniquely useful when used properly. 

A teacher concerned with the ability of pupils to recall and use a 
specific date, or perhaps a chemical formula, could probably use a 
completion-type item for the measurement of this objective. If the 
objective calls for the recognition of the relative importance of sev-
eral alternative actions, a multiple-choice item might be the best
choice. If the teacher is concerned about the ability of students to
organize data and to express themselves in an unstructured fashion, 
he might use an essay test. If he were anxious to discover social 
groupings within a class, he might employ a sociometric device. If 
an objective of instruction deals with the development of a trait 
such as leadership, the teacher might best observe the actual overt 
behaviors of students and record what he actually sees on anecdotal 
record forms, or he might use a rating scale developed to serve this 
particular need. 

A wide variety of evaluative techniques exists. It is the job of the 
teacher to learn what these techniques are, their proper uses and 
limitations, and how they may best be employed to aid boys and girls 
in their learning experiences.


EVALUATION AND HUMAN VALUES 


In education, the focus of teaching, and thus of the evaluation
process, is upon the goals, ambitions, and hopes of human beings.
Educational objectives are goals for boys and girls, defined in terms
of desired human behaviors. These objectives spell out those be- 
haviors (not only overt, but the inner mental activity and attitudes, 
as well) and values that society feels will contribute most to the 
individual and to society. With the focus upon boys and girls, and 
upon the values and behaviors which it is hoped will characterize 
their lives, it becomes vital that evaluation processes be consistent 
with democratic principles. 

One of the basic premises of education in a democracy is the belief 
in the worth and integrity of all individuals. This is a fundamental 
concept of the democratic heritage. It is expressed in the concept 
that every person should be respected as a person in his own right, 
and that each person should have an equal opportunity to grow and 
develop to the fullest extent of his capacities. Unfortunately, this 
basic tenet of democracy is frequently violated under the guise of 
standards imposed by the evaluation process. 

Far too often educators have overlooked one of the most important 
outcomes of the measurement movement—a knowledge of the tre-
mendous range of individual differences in ability within a given
age group. It is no longer news to the teacher that a group of twelve-
year-old children may vary in mental age from eight to sixteen
years. In terms of academic achievement, it is common to find that
there are sixth-graders who will do as well on a standardized test in 
a subject matter area as the average twelfth-grader, and that there 
are boys and girls in the twelfth grade who barely equal the 
achievement of the average sixth-grader. Such facts are common 
knowledge to school people, and tend to be taken for granted until 
it becomes necessary to assign grades. Then all too often it is as- 
sumed that a common standard of achievement exists for all students, 
although they vary greatly in academic aptitude. We tend to com- 
pare all members of the group with the “average,” even though we 
know from our measures of aptitude that some are definitely below 
while others are above average. This procedure is patently unfair 
to those boys and girls who make up these two nonaverage groups. 

In far too many cases, students who are far below average in 
academic ability have little chance to enjoy success no matter how 
hard they try. The majority of D and F grades go to these students, 
who, through no fault of their own, have less aptitude for learning 
than the “average” boy or girl. Such repeated failure causes these
students to dread tests and to anticipate eagerly the time when 
they can quit school. 

Society insists that all boys and girls achieve at least a minimum 
level of education. Thus, all children, with the exception of a small 
group of extreme deviates, are forced to attend school. School can be 
fun and it can be worth-while. For those of lesser mental ability, 
though, school can be frustrating, especially if they find that their 
efforts are unrewarded or that they are rejected because they do not 
come up to the “average.” No wonder that many of these children 
develop negative attitudes toward learning, toward school, and per- 
haps toward society as well. No wonder that many lose confidence 
in their ability to learn even those elementary things which their 
limited mental ability would permit. No wonder that many of them 
come to feel that teachers “talk” democracy and the worth and in- 
tegrity of the individual but do not practice what they preach. Under 
conventional grading procedures, it appears that teachers are being 
a little less than fair to students of lesser ability. A kind of evalua- 
tion procedure more consistent with our democratic ideals should 
consider each person’s potential for learning. The student who works 
to the best of his ability should find that his efforts are acceptable
and worthy in the sight of his teachers and of his peers. Teachers
have a great responsibility to see to it that all children have equal 
opportunity to use their talents. 

Conventional measurement practices may also tend to encourage 
superior students to adopt much lower standards of achievement 
than they are able to attain, and tend to give them false success 
feelings. With a grade of C given for average work, the student of 
superior ability is seldom encouraged to achieve at a level in keeping 
with his capacity. If grades of B and A can be earned with a modest 
effort (and in many cases this is all that is demanded of the superior 
student), what incentive is there to achieve at a higher level? 

Many students slide by under conventional evaluation procedures 
that do not encourage them to exercise the individuality that is theirs. 
They should be evaluated in terms of their ability to learn rather 
than in terms of an arbitrary standard. 


EVALUATION CONSISTENT WITH THE PSYCHOLOGY 
OF LEARNING 


It is known that students learn best when—


1. They see a real purpose in learning.
2. They have enough background of experience to give meaning to
what they have to learn.
3. They recognize specific needs.
4. They have enough success experiences to encourage them to
further learning.
5. They have been helped to know something of their potential for
learning.
6. Feelings of insecurity, frustration, and doubt have been success-
fully resolved.


Evaluation can and should be consistent with the above principles. 
Unfortunately, many of our conventional evaluation practices and 
processes do not help, but rather hinder, effective learning for many 
pupils. As was noted earlier in this section, the children with limited 
ability have little chance for success experiences in schools. They 
soon learn to achieve their “successes,” or peer approval, through 
behavior which most teachers would frown upon. In its extreme form 
such behavior is termed juvenile delinquency. Pupils who experience 
failure every day can scarcely be blamed if they lose interest in
learning. Pupils who have been discouraged and frustrated by their
inability to make normal progress can hardly be expected to use
their ability to its fullest in experiences that, to them, do not seem 
to have any constructive purpose. 

Evaluation must be based upon comprehensive and continuous 
measurement of all phases of individual development and toward all
objectives of the school. It will require much data about each pupil,
and it will create many teaching and administrative problems. Just
as good teaching is not an easy job, so careful evaluation, which is,
after all, but one phase of good teaching, is not an easy job. Con- 
ventional grading practices, based on a minimum of data most easily 
collected and on comparisons of individuals against the group or 
against arbitrary standards of accomplishment without due regard
for the human relationships involved, are much easier to use. Only 
the best, whether we speak of evaluation practices, school supplies 
and equipment, or teaching personnel, is good enough if the schools 
are to discharge adequately the responsibility society has entrusted 
to them. In its highest form, evaluation aids individuals and groups 
to assume responsibility for their own actions. It is a process that is 
necessary to promote the psychological security of students and 
teachers, secure public support and understanding, and examine the 
progress made by students and teachers. It is considered an action 
that fosters the growth of individuals as individuals and as members 
of the classroom group. Evaluation is the means by which an ob- 
jective, valid, reliable, and usable accounting is made of the progress 
of a classroom group as the students grow academically, socially, 
emotionally, physically, and spiritually. 




CHAPTER 


2 


The When, What, Who, Where, 
and How of Evaluation 


WHEN SHOULD EVALUATION TAKE PLACE? 


Evaluation is, basically, a process of determining the nature, the
extent, and the desirability of the changes that occur in a student as
he grows and develops. Since growth and development are continuous
processes, evaluation must go on continuously if all the changes in
the student are to be fully appraised. There are no blank periods in
human development, periods when nothing important is taking place.
Therefore, there can be no blank periods in evaluation. It must begin
the day the child is born and continue as long as someone is respon-
sible for his guidance. This usually means until he completes his
formal education, although, as noted in Chapter 1, many agencies
and individuals in society continue to evaluate and appraise him for
a variety of reasons long after his school years are over.

Practically speaking, from the standpoint of the school and the
teacher, evaluation of the child begins with the prekindergarten
“round-up,” when his health history and physical condition are re-
corded. It ends, in most cases, when he leaves school, either as a
graduate or as a dropout. In an increasingly large number of in-
stances, however, the school is now including in its evaluation pro-
gram a follow-up study of each school-leaver to facilitate the
appraisal of its educational program.


It is obvious, then, that not only must evaluation be continuous, 
it must also be comprehensive, i.e., it must encompass the entire 
range of the student's activities and experiences, including the cur-
ricular, the cocurricular, and the nonschool. Further, it must gauge
his progress in terms of a variety of criteria (variously described as
objectives, goals, aims, or outcomes), which will be discussed in de-
tail in the next chapter. Evaluation has three principal functions—to
determine the present status of the student, to identify the factors
which are responsible for, and influence, his growth and development,
and to determine his potentialities for future growth and develop-
ment. These three functions constitute the basis for effective guid-
ance of the student.

There is an old saying to the effect that one rose does not make a 
summer. It is likewise true that one bit of evaluative data does not
present a complete picture of an individual. To portray the individual
as he really is, and to show the pattern of his development accur-
ately, requires a continuous appraisal and recording of the progress
of the student in each important area of his growth over a long period
of time. It is only through such a long-term evaluation of progress
that an understanding of cause and effect relationships can be 
built up and a valid picture of the individual developed. 

The answer to the question “When should evaluation take place?”
is, therefore: at all times, in every possible situation or activity, and
throughout the growth and developmental years of the individual.


WHAT SHOULD BE EVALUATED? 


The very worthy admonition to teachers, “Teach the whole child,” 
becomes just so many empty words unless teachers know the whole 
child. Teachers cannot separate teaching students from studying 
students. The two processes are inextricably interwoven and one is of 
little value without the other. To attempt to teach the child with- 
out knowing him is not to teach him at all, but merely to teach the 
subject or the book. On the other hand, to know the child and then 
not to utilize this knowledge as a basis for guiding his learning ex- 
periences more effectively is to be wasteful of the resources of the 
school and the potentialities of the student. 

The school is charged with the responsibility, along with such 
other social agencies as the home and the church, for preparing boys 
and girls to live happy and useful lives in a very complex society. To 
discharge this responsibility, the school must provide learning ex-
periences in a wide variety of situations and areas of living. Every
experience that a human being has makes its mark, inasmuch as he
is never quite the same after the experience as he was before. There- 
fore, it is essential that the evaluation program take note not only 
of the outward (behavioral) manifestations of change in the student, 
but also of those factors, personal and environmental, which impinge 
upon him and influence his behavior and his learning. These factors 
may be classified for convenience in the following categories: (1) 
physical, (2) health, (3) psychological, (4) educational, and (5) 
environmental.

To obtain a true picture of the whole student, evidence is needed 
concerning each of these factors and the influence each has upon the 
student's progress, whether satisfactory or unsatisfactory. In each 
category there are innumerable other specific factors which affect the
student's development. It would be impractical, if not impossible, 
to enumerate all of them. However, to illustrate the types of factors 
in each category the following brief summary is presented. It is es- 
pecially important to note any unusual or atypical conditions as- 
sociated with the various factors that might either handicap or fa- 
cilitate the student in his development. 


I. Physical factors
A. Speech
B. Hearing
C. Vision
D. General physical (bodily) condition
1. Motor coordination
2. Vigor
3. Vitality
4. Growth
II. Health factors
A. Eating habits
B. Sleeping habits
C. Personal hygiene
D. Complexion
III. Psychological factors
A. Personality adjustment
1. Social competence
2. Aggressive or submissive tendencies
3. Attitudes
4. Group (peer) relationships
B. Aptitudes
1. Scholastic 
2. Specific 
C. Interests 
D. Reading skill 
E. Mental health 
IV. Educational factors 
A. Past achievement—Areas of superiority and weakness 
B. Suitability of curriculum 
C. Cocurricular participation 
1. Special successes or failures 
2. Areas of activity chosen 
D. Suitability of instructional materials and methods 
E. Study and work habits 
F. Teacher-student relationship 
G. Student-school relationship 
V. Environmental factors
A. Family background
1. Cultural and social characteristics and influences 
2. Attitudes toward child, school, society 
3. Economic and social status 
4. Marital status of parents 
5. Home duties and responsibilities 
B. Neighborhood factors 
1. Geographical location and characteristics 
2. Companions 
3. Recreation facilities 
4. Employment (full and/or part-time) opportunities 

One of the major defects of the evaluation program in many
schools is not what is done, but what is not done. Certain factors re- 
lated to student progress are frequently neglected, resulting in only 
a partial picture of the student being obtained. On the other hand, 
certain aspects of the student's progress are appraised in almost 
every instance. In this latter category are: (1) academic (subject 
matter) achievement, (2) physical skills, and (3) mental capacity. 
In the category of “frequently neglected factors” are the student's 
(1) attitudes and opinions, (2) critical thinking ability, (3) mental 
health, (4) work and study habits, and (5) social skills and group 
relationships. 

Frequently teachers say that true-false or multiple-choice tests,
or perhaps essay exams, are fine for measuring how much a student
knows—how much knowledge, skill, or understanding he has—but 
there is nothing available for measuring the intangibles of education 
that are equally important. The key to the measurement of intangi- 
bles is to be found in a clear definition of purposes. As we learn to 
analyze and define what it is we are trying to do, and as we learn 
how to translate these goals in terms of expected student behaviors, 
we can also learn how to measure those behaviors. The history of the
measurement movement is a history of how intangibles were made 
tangible and measurable, and how yesterday’s impossible problems 
became today’s routine practices. 

For many years “critical thinking” had been considered an ob- 
jective of education, but there had been little success in measuring 
this important ability until Tyler and his colleagues in the Eight- 
Year Study analyzed the objective into specific behaviors. “What is 
it a person does when he thinks critically?” “What behaviors differ-
entiate the clear thinker from the fuzzy thinker?” When critical
thinking was thus defined it became apparent that it embraced a 
number of skills, such as interpreting graphs and tables of data in 
certain precise ways, reasoning by authority or by analogy, and so 
on. As soon as the objective was defined, it became possible to meas- 
ure it. Critical thinking has ceased to be an intangible, although 
there are still too few teachers who are willing to spend the time 
necessary to master the techniques of measurement in this area. Our 
instruments to measure aspects of critical thinking are not all per- 
fect, but we do have measures in this important area where none 
existed before, because a few experimentally inclined teachers were 
willing to analyze an objective carefully to see what it really implied 
for measurement. 

An objective that is intangible to the extent that it cannot be 
described in terms of student behaviors cannot be effectively reached. 
Conversely, if we can define a goal in behavioral terms, and we have 
some reasonable assurance that what we are trying to do is being
learned, then measurement can take place. It may be that the first 
attempts at measurement in a critical area of learning will be weak 
and the instruments that we develop will at first satisfy few of the 
criteria we shall soon discuss. However, if we expect students to
learn new skills and to improve inadequate ones, then teachers need
to continue to learn and to improve their practices in measurement
and in evaluation. 


WHERE DOES EVALUATION TAKE PLACE? 

To appraise the progress of the student on the basis of the various 
objectives and factors previously mentioned, evaluation cannot be 
restricted to the classroom. It must also take place in the extra- 
class school activities; in the home, the neighborhood, the com- 
munity, and in the place of employment. The youth is a “different” 
person in each of these situations and places. His activities, goals, 
needs, and skills, as well as his associates and his environment, differ 
in each one. 

In the classroom the teacher determines the status of, and the 
changes in, the student’s— 


1. Subject matter knowledge and skills.
2. Mental ability.
3. Work and study habits.
4. Attitudes.
5. Personal adjustment—response to success and frustration, emo-
tional stability.
6. Individual and group relationships.
7. Interests and special talents.
8. Reading ability.
9. Health.
10. Physical condition.
11. Language usage.
12. Personal appearance, dress, posture, and so on.
13. Disposition.
14. Acceptance of responsibility.

Clubs, dramatics, athletics, social organizations and activities, and 
music groups provide opportunities to appraise the student in the 
same areas as in the classroom but in different settings and under 
different circumstances. The activities are frequently less formal 
than in the classroom and, as a result, the student is somewhat freer 
to act normally and to express himself with less restraint. 

In the home, a social situation exists which is entirely different
from that in the classroom or other school activities. The same is 
true in the neighborhood and the community, where the youth’s 
physical surroundings, as well as his personal relationships, differ.
Consequently, he must vary his behavior to meet the changed situa-
tion. He may find, for instance, that the aggressive behavior which
makes him a leader in his peer group at school only creates conflicts 
with parents and older siblings at home. As a result, he may become 
a very submissive young man at home to effect the best possible ad- 
justment to the situation. Thus, an evaluation of his behavior at 
school would label him as “aggressive,” while an interview with his 
parents would undoubtedly result in describing him as “submissive.” 
The youth’s behavior in the local drugstore or hangout, church, or 
on the street is likewise dependent upon the demands of the situa- 
tion, for in every instance he is attempting to adjust to conditions in 
ways that give the greatest promise of satisfying his basic needs for 
recognition, security, status, success, and happiness. 

When the teenager takes a part-time job as a clerk, filling-station 
attendant, or delivery boy, he is in a position where he is evaluated 
by the boss, by the customers, and by his associates. They appraise 
his personality, his job know-how, his attitudes, and many other 
characteristics. In addition to appraising him, they also appraise the
school from which he comes, since the public forms its opinion of 
the school by the “product” it turns out. Often the school asks the 
employer for an appraisal of the student-employee. The student may 
also ask the employer for an appraisal of his work in the form of a 
recommendation. The employer’s evaluation may even become a part 
of the student’s cumulative record, as is true if the employment is 
part of the school’s work-experience program. If the school maintains 
a placement service for its students, or if it conducts regular follow-up 
studies of its former students, it actually solicits written evaluations 
from employers and keeps them on file. 

It is significant that almost all employers are at least as interested
in the personality traits of the student-employee as in
his knowledge or his job skill. This should be a “tip-off” to the school
that character and personality development should be stressed just as 
much as, if not more than, subject matter. 


WHO EVALUATES THE INDIVIDUAL AND THE SCHOOL? 


Since evaluation takes place not only in the school, but also in the 
home and in the community, many persons in addition to school 
personnel participate in the evaluation. Further, it must be recog-
nized that not only is the student evaluated, but the school as well.
Evaluation of the student and the school, therefore, takes place 
both inside and outside the school plant. The intraschool evaluation 
is carried on by the teachers, the administrators, the school board,
and the students. The members of these groups have their own cri- 
teria and their own purposes and reasons for the evaluation they 
make. 

The teacher's primary evaluative function, of course, is the ap- 
praisal of the progress made by the student in his physical, mental, 
social, and emotional development. The criteria for this evaluation 
are generally the functional course objectives which have been set up 
cooperatively by the teacher and the students. In order to obtain
the requisite environmental and background information against 
which to judge the student's growth, the teacher also appraises, in- 
formally, other people and other things—the superintendent; the 
principal; the teachers; the philosophy of the school; the curriculum;
the student personnel services including counseling, testing, cumu-
lative records, the medical and nursing program, extracurricular of-
ferings, placement, and work experiences; the building; facilities for
library, athletics, drama, music; equipment such as audio-visual,
laboratory, and duplicating machines; and clerical and secretarial as-
sistance. This is not a complete list of all the various phases and
aspects of the school which the teacher evaluates; it is merely il-
lustrative.

The administrative personnel of the school have somewhat different 
reasons and purposes for their evaluation; consequently, their evalua-
tive criteria are also somewhat different. The superintendent, having
little or no personal contact with many of the students, usually does 
not make an effort to evaluate them directly. He does, however, 
carry on a constant appraisal of all the other aspects of the school 
which the teacher also evaluates. The ultimate purpose of his evalua- 
tion is the same as the teacher’s—to find out whether the school is 
doing the best possible job of educating youth. One of his most useful 
evaluative techniques is the follow-up of his former students. 

The principal’s position places him in closer contact than the 
superintendent with teachers and students. Therefore, his evaluation
of the school is concerned not only with the ultimate outcomes of the 
school’s educational program but also with the immediate results
as seen in the day-to-day behavior of students. In most schools he is
responsible for the supervision of teachers, also a form of evaluation.

Another of the administrative agencies of the school which is in-
terested in evaluation is the board of education. The particular evalu-
ative function of this body is the appraisal of the over-all program
of the school to ascertain whether or not it is in accord with the
established policies and philosophy of the board, acting as the
elected representatives of the community and expressing community 
opinion. The board thus represents the middle man, officially binding 
the school and the community together, and acting as a sounding 
board of community attitudes toward the school and its functions. 

Many persons outside the school family, individually and in 
groups, appraise the school in one way or another and for a variety 
of reasons and purposes. Probably the group most directly concerned 
with the school's effectiveness in guiding the growth and development 
of children is made up of the parents. Each parent evaluates his 
child upon the basis of his own standards or criteria and either 
praises or condemns the school for the results. Banded together in
parent-teacher associations, parents appraise the school, its teachers, 
administration, activities, objectives, building, facilities, and equip- 
ment. 

As mentioned earlier, the employer also has a stake in the school, 
since many graduates, as well as dropouts, become his employees. His 
interest is primarily concerned with the ability of the employee, the 
former student, to adapt himself to the conditions existing in his 
establishment and to work harmoniously with his fellow workers. 

The total citizenry, commonly called society, judges the school in 
terms of the public behavior of students and former students, and 
the contributions they make to the life of the community and all of 
its components, such as churches, societies, and organizations. In 
many instances these subgroups of the total citizenry appraise the
school according to their own unique criteria or standards of success-
ful citizenship. Although the criteria of these subgroups do not al-
ways express the attitude or the sentiment of society as a whole,
these groups may at times be very vocal in their criticisms (much
less frequently, commendations) and create serious problems for the
school, which must try to satisfy all elements of society.

Colleges represent another group which evaluates the school in
terms of its product—the graduate. Colleges are not content to ac- 
cept the grades of this or that student as an automatic guarantee of 
his capacity to succeed in college work without knowing the school 
from which he graduated. Further, the college is becoming more and 
more concerned with the social and emotional maturity of the new 
student, and not only asks the sending school for information of this 
kind, but also makes careful appraisals of the student’s personality 
traits and characteristics as a part of its own evaluation program. 
Information about the student’s success or failure in college is fre- 
quently requested by the secondary school as a part of its follow-up 
studies.

Finally, the school is evaluated by the several accrediting agencies 
and associations whose criteria include the adequacy of the training 
of the teachers and administrators, the curriculum, the school plant,
facilities, and equipment. 

Every individual or group that evaluates the student or the school
has its own special reason for doing so. Ultimately, however, the de-
sire of each is to improve the quality of the training provided for the
youth of the nation. To that end all are dedicated—students, teach-
ers, administrators, parents, employers, and the total citizenry, as
well as the many subgroups in society.


HOW SHOULD EVALUATION BE CARRIED ON? 


Now that we have considered the when, what, who and where of 
evaluation, there remains only the how to complete the picture.

The school’s appraisal and records should be focused on the indi-
vidual student; they should be concerned both with his present
status in terms of his capacities and achievements, and with the
relation of his status to expected growth patterns. Evaluation should
determine the desirability of the growth process in terms of the ex-
pected outcomes.

There are four steps involved in the process of providing an effec- 
tive and functional “education for life” for every student: 

1. Determining the student’s status.

This involves the use of tests and appraisal techniques to obtain
evidence concerning his mental capacity, his subject matter knowl-
edge and skills, his health and physical condition, his social and
emotional maturity, and his home and family.

2. Determining his needs on the basis of his status.

The student’s strengths and weaknesses will become apparent
as we become familiar with his present status.

3. Determining, on the basis of his needs, the experiences which
appear to have the greatest chance of meeting his needs.

This implies that the curriculum must be flexible to allow the
educational program to be adapted to meet the individual student’s
needs.

4. Evaluating to determine growth in the direction of meeting those
needs or desirable goals.

The same instruments and techniques should be employed for
this purpose as in step 1, since the same elements are to be
re-evaluated to determine status following a period of learning and
growth.


On the basis of this re-evaluation (and it must be thought of as 
being a continuous process, not a before-and-after proposition), new 
needs may be discovered and lack of progress toward objectives may 
be noted. It is entirely possible, then, that the student’s program 
may need partial or complete overhauling to make it more effective. 

This rather general discussion of the relationship of evaluation to 
the over-all educational program of the school is incomplete in one 
important respect, however; it fails to describe the process of evalu- 
ation itself. Evaluation proceeds according to a rather well-estab- 
lished plan and in logical steps: 


1. General objectives are formulated for the school.

2. These objectives are defined and expressed in terms of human be-
havior or functional objectives.

3. Situations are identified or arranged in which the described form
of behavior is exhibited and may be observed and analyzed.

4. Instruments, techniques, and devices are selected or developed to
measure or appraise the students’ behavior in the appropriate situ-
ations (step 3).

5. The instrument, technique, or device is administered or employed.

6. The results of the measurement or appraisal are judged or evalu-
ated in terms of the objectives.


The general objectives of the school ought to be a product of the 
entire staff’s working under the leadership of the superintendent, 
the principal, and/or a committee of teachers. These objectives
should express the philosophy of the school and give direction to
the efforts of teachers, students, and administrators, i.e., to act as 
directional signals for the teaching-learning process. 

Such general objectives would be of little value to the individual 
classroom teacher, however, in helping him judge the growth and 
progress of his students. His objectives must be specific in terms of 
actual behavior. He would have difficulty in appraising a student’s 
progress toward the general goal—“to grow in ability to think
rationally, to express . . . thoughts clearly, and to read and listen
with understanding.” However, one evidence of progress toward this
general goal is expressed in terms of the following functional or be-
havioral outcome in Freshman English: “The student is able to com-
pose sentences, paragraphs, and written reports.” This behavior the
teacher can measure. 

In step 3 of the evaluation procedure the teacher identifies or ar- 
ranges situations in which the student has an opportunity to display 
the behavior called for in the objectives established in step 2. In the 
case of the functional objective stated above (step 2), such a situation
might occur in an English class or in any class where the student 
would be required to express himself in writing. In appraising growth 
toward other objectives, the teacher might look for situations in 
which he could find evidence of proper, sound attitudes, skill in 
music, appreciation of art or literature, and so on. 

Step 4 involves the selection or construction of instruments, de- 
vices, or techniques for obtaining evidence, objective or subjective, 
concerning the amount of progress made by the student in the direc- 
tion of a specific goal. In the case of the English objective cited 
above, the teacher might select a standardized test in English compo- 
sition or grammar, or he might decide to appraise the student's prog- 
ress by having him write a theme or by having him rephrase poorly 
written statements or sentences.

The task of selecting an instrument to measure progress toward 
this goal (step 2) may be relatively simple and easy. At times, how- 
ever, adequate instruments with which to measure growth in some 
areas of human development are not available. Therefore, to obtain 
any appraisal of progress, we are forced to rely upon subjective 
methods and techniques even though quantitative data form the most 
desirable basis for sound evaluation. We must not omit from our
lists of goals and objectives certain behavioral outcomes simply be-
cause there is, at present, no test, device, or technique with which 
to measure them. Perhaps next week, next year, or ten years from now 
there will be such an instrument. In the meantime we are alert to the 
fact that we do not have a complete picture of the student’s growth. 
The importance of the teacher in the development of appropriate 
evaluation instruments cannot be overemphasized, for he is closer 
to the actual classroom situation than is the test specialist. No mat- 
ter what technique of appraisal is designed or selected, it must be 
subjected to the tests of validity, reliability, objectivity, and usabil-
ity to determine its value.

The administering of the instrument by which the student’s prog- 
ress is to be measured is probably the simplest step in the total 
evaluative process. Most of the formal, standardized tests and de- 
vices are relatively easy to use by merely following closely the 
manual of directions. The less formal, teacher-made measuring 
instruments and the subjective techniques (observation, anecdotes, 
sociodrama, written compositions, and the like) depend largely upon 
the ingenuity, skill, and understanding of the teacher, since only gen- 
eral instructions concerning their use are available. 

The final step in the evaluation procedure involves a high degree 
of subjectivity. It is the step in which meaning is given to data or 
evidence, whether it be quantitative, based upon a score of some 
kind, or qualitative, derived from some subjective technique. As an 
example, let us turn again to the behavior outcome of Freshman 
English: “The student knows how to compose sentences, paragraphs,
and written reports.” 

Let us assume that the teacher administered an informal test to 
determine how well the students in his class had mastered this skill. 
On the test Jane received a score of 92, Tom 78, and Bill 83. These 
scores are meaningless unless we know the answers to the following
questions:


1. What was the highest score possible on this test? 

2. What were the highest and lowest scores made by students on the 
test? 

3. What was the previous status of these students with respect to 
this objective? 

4. What was the average score of the class? 


5. What are the mental capabilities of Jane, Bill, and Tom? 
6. What are these students’ attitudes, physical conditions, interests, 
personal goals, and desires? 


When we obtain the answers to these questions, we can give mean- 
ing to the scores and, therefore, evaluate the growth or progress made 
by each student in the direction of the given objective. Without this 
information the scores remain impersonal statistics, meaningless 
numbers. 
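
The point can be illustrated with a short sketch in Python. Only the three
scores (Jane 92, Tom 78, Bill 83) come from the example above. The highest
possible score, the other class scores, and the previous scores are
hypothetical placeholders chosen for illustration, and the names interpret
and report are likewise assumptions. Questions 5 and 6 call for professional
judgment rather than arithmetic, so they are not modeled here.

```python
from statistics import mean

# Scores taken from the text.
scores = {"Jane": 92, "Tom": 78, "Bill": 83}

# Everything below is a hypothetical placeholder, not data from the text.
max_possible = 100                                 # question 1
class_scores = [92, 78, 83, 65, 88, 71, 95, 80]    # questions 2 and 4
previous = {"Jane": 90, "Tom": 60, "Bill": 83}     # question 3


def interpret(name):
    """Place one raw score in context so that it can be evaluated."""
    score = scores[name]
    return {
        "percent_of_possible": round(100 * score / max_possible, 1),
        "class_low": min(class_scores),
        "class_high": max(class_scores),
        "difference_from_class_mean": round(score - mean(class_scores), 1),
        "gain_since_previous": score - previous[name],
    }


report = {name: interpret(name) for name in scores}
for name, summary in report.items():
    print(name, summary)
```

Under these assumed figures Tom has the lowest of the three scores but the
largest gain, which is the kind of distinction a raw score alone cannot show.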


CHAPTER 3


Identifying Educational Outcomes 


TEACHERS MUST constantly raise questions concerning the core of the 
learning situation—purpose. Why is it necessary for all students in 
algebra to cover the material through page 210 in the textbook? Of 
what value is it to students to diagram sentences? Is it important 
for students to memorize dates in American history between the 
passage of the Stamp Act and the framing of the American Constitu- 
tion? Why should students be expected to know how to read a news- 
paper? Of what value is it to students to know the chemical formula 
for sugar? Is it important for students to know the leading imports 
and exports of Chile? Why should students have sex education? Why 
should students know how to outline? The number of questions 
that teachers can ask about their areas is limited only by the imagi- 
nation that they bring to the problem. Basic to all the questions is 
the need for an understanding of what we are doing in our class-
rooms and why. To achieve this understanding it is necessary for
teachers to identify, as specifically as possible, the objectives that 
the students should attain. 

As teachers use textbooks and well-developed curriculum guides, 
they seldom seem to have the opportunity to determine why they are 
teaching what they are teaching, or what they are actually ac- 
complishing by their teaching. The ends of education become a num- 
ber of pages in a textbook, the completion of projects recommended 
in a curriculum guide, or the passing of assignments that will en- 
able students to be ready for the next teacher in the next grade. We
must, however, continuously remind ourselves that textbooks,
projects, assignments, and discussions are not ends in themselves 
but the means to the ends. Teachers confronted with the task of 
providing learning experiences for the student must assist in de- 
termining the outcomes of education. 


DETERMINING THE OVER-ALL OBJECTIVES OF EDUCATION 


Present-day objectives of education stem from five major sources: 
(1) studies of our present society, (2) studies of the learner, (3) 
studies by subject specialists, (4) studies and discourses by philos- 
ophers, and (5) studies by psychologists. Studies in each of these 
areas provide information which in turn is assimilated by various 
groups, such as professional organizations, university professors, 
specialists in state departments of education, and committees of 
teachers and administrators in local school districts. The assimilated 
materials find their way into courses of study, curriculum guides, 
courses in education, textbooks, discussions at conferences, and 
periodical articles. School systems, through study committees or 
administrative dictum, then begin the process of translating the 
general statements into the day-to-day reality of the science teacher,
the core teacher, the math teacher, and the social studies teacher.
The process of evolving educational objectives is presented diagram-
matically in Figure 1.

It is difficult to identify all of the basic research which has influ- 
enced education. However, some of the studies that have had influ-
ence are known. Caroline B. Zachry has made extensive studies of
the problems of the adolescent. These and similar studies have re- 
sulted in an increased awareness of the impact of emotional develop- 
ment upon the adolescent. Howard M. Bell's study of out-of-school 
youth revealed the importance of economic security for adolescents. 
This study tended to focus the school's attention upon the need for 
vocational education. Kurt Lewin and other social psychologists 
have provided educators with a theory, as well as with case studies, 
of social impact and its consequences upon adolescent behavior. The 
attitudes and opinions of students have been explored extensively by 
H. H. Remmers and his findings have influenced social studies teach- 
ers and others to re-examine the basic premises upon which existing 
social studies sequences are developed. Hilda Taba, Bruno Bettel-
heim, Gordon Allport, Robert Havighurst, and others have studied
such varied problems as moral beliefs, ethnic tolerance, religious 
views, and the ideal self. Each of these studies, as well as many
others, has helped educators develop a better perspective of school
youth. 


Fig. 1. Evolution of educational objectives


Studies of society have direct implications for the schools. Ac- 
cording to Ralph Tyler, director of the Institute of Behavioral 
Sciences: 


The variety of ways by which information regarding activities, prob- 
lems, and needs of contemporary life may be obtained is sometimes con- 
fusing. During the past twenty-five years hundreds of investigations 
have been made of contemporary life with a view to inferring educa- 
tional objectives. These have involved observations of behavior, analyses 
of newspapers, of magazine articles, of the ideas of frontier thinkers 
about the important problems of the day, studies of communities in soci- 
ological surveys as Lynd’s volume on Middletown, or the Warner series
on Yankee City, activity analyses of various kind of individual activities
as well as job analyses for a variety of vocations. Because the possible 
materials for analysis are so numerous and the possible methods of in- 
vestigation are so varied, it becomes important to recognize again that 
analyses of contemporary life are possible at several levels. In the first 
place, some analyses of contemporary life are national in scope if not 
international, and do not need to be repeated by every school group 
working upon the curriculum. Data are already available to throw a 
good deal of light upon the possible objectives in the field of national 
and international affairs, data indicating critical, social, political and eco- 
nomic problems. There are also data in the general areas relating to 
music, the arts, and aesthetic life. 


A good many courses have been built upon analysis of life outside 
of school. The well-known Rugg series of social-studies books was de-
veloped from an analysis made of contemporary critical social problems
as indicated by the studies made of so-called frontier thinkers, that is, 
leaders in the social science field. A number of language art series of 
texts in use in the schools were made by making an analysis of the errors 
people of today commonly make in language usage. 


Increasingly, the community schools in the South are basing much of 
their curriculum material upon analyses of community needs, with spe- 
cial reference to better utilization of natural resources, and more adequate 
development of human resources as revealed by community surveys. 
Studies of contemporary life provide a prolific source of information 
for suggestions regarding objectives.¹

1 Ralph W. Tyler, Basic Principles of Curriculum and Instruction (Chicago: Uni-
versity of Chicago Press, 1947), pp. 8-13.

It can readily be seen that studies of society, whether made by
sociologists or teachers, have a direct application to the framing of
educational objectives.


GENERAL STATEMENTS OF EDUCATIONAL OBJECTIVES


While a complete study of the ultimate sources of the objectives
of education would be highly beneficial, the teacher-evaluator gen-
erally begins with some understanding of the statements of objec-
tives which have been derived from various sources by special com-
mittees, textbook authors, and professors of education. Two such
lists are included here to assist the teacher-evaluator in recalling 
some of the numerous statements of objectives that have been de- 
rived in the past. Perhaps one of the most significant of such state- 
ments in recent years came from the Educational Policies Commis- 
sion and its listing of the “Imperative Needs of Youth.” 


1. All youth need to develop salable skills and those understandings
and attitudes that make the worker an intelligent and productive
participant in economic life. To this end, most youth need super-
vised work experience as well as education in the skills and
knowledge of their occupations.

2. All youth need to develop and maintain good health and physical
fitness.

3. All youth need to understand the rights and duties of the citizen
of a democratic society, and to be diligent and competent in the
performance of their obligations as members of the community
and citizens of the state and nation.

4. All youth need to understand the significance of the family for
the individual and society and the conditions conducive to suc-
cessful family life.

5. All youth need to know how to purchase and use goods and serv-
ices intelligently, understanding both the values received by the
consumer and the economic consequences of their acts.

6. All youth need to understand the methods of science, the influ-
ence of science on human life, and the main scientific facts con-
cerning the nature of the world and of man.

7. All youth need opportunities to develop their capacities to appre-
ciate beauty, in literature, art, music, and nature.

8. All youth need to be able to use their leisure time well and to
budget it wisely, balancing activities that yield satisfactions to
the individual with those that are socially useful.

9. All youth need to develop respect for other persons, to grow in
their insight into ethical values and principles and to be able to
live and work cooperatively with others.

10. All youth need to grow in their ability to think rationally, to ex-
press their thoughts clearly, and to read and listen with under-
standing.²


2 Education for All American Youth: A Further Look (Washington: Educational
Policies Commission, 1952), p. 216. This is a revision of the earlier Education for All 
American Youth published in 1944. 


In 1955, delegates from the forty-eight states participated in the 
White House Conference on Education. This conference, represent- 
ing, in the main, citizens and not professional educators, drew up 
a list of what the public schools should accomplish that is very 
similar to the preceding list. The conference’s statement of what 
the schools should accomplish follows: 


1. A general education as good as, or better than, that offered in 
the past, with increased emphasis on the physical and social 
sciences. 

2. Programs designed to develop patriotism and good citizenship.

3. Programs designed to foster moral, ethical, and spiritual values.

4. Vocational education tailored to the abilities of each pupil and to 
the needs of the community and Nation. 

5. Courses designed to teach domestic skills. 

6. Training in leisure-time activities such as music, dancing, avoca- 
tional reading, and hobbies. 

7. A variety of health services for all children, including both physi- 
cal and dental inspections, and instruction aimed at bettering 
health knowledge and habits.

8. Speech treatment for children with speech or reading difficulties 
and other handicaps. 

9. Physical education, ranging from systematic exercises, physical 
therapy, and intramural sports, to interscholastic athletic com- 
petition. 

10. Instruction to meet the needs of the abler students. 

11. Programs designed to acquaint students with countries other 
than their own in an effort to help them understand the problems 
America faces in international relations. 

12. Programs designed to foster mental health. 

13. Programs designed to foster wholesome family life. 

14. Organized recreational and social activities. 

15. Courses designed to promote safety. These include instruction in 
driving automobiles, swimming, civil defense, etc.³

3 Full Report, The Committee for the White House Conference on Education
(Washington, D.C.: U. S. Government Printing Office, 1956), pp. 8-9.

Statements of educational objectives are often generalities with
little or no meaning to the student preparing to become a teacher, to
the neophyte teacher, or, quite often, to the mature, experienced
teacher. In part, much of the confusion exists because of the indi-
vidual’s failure—

1. To understand clearly the relationships between what takes place
in the classroom and the general purposes of education.

2. To understand the continuous nature of education.

3. To understand that each learning experience may influence the 
learning of a multiplicity of behaviors. 

4. To understand the values to be gained by carefully identifying 
teaching objectives. 


If this confusion can be minimized it should promote better teaching 
and enhance the chances for the development of effective evaluation. 


RELATING CLASSROOM OBJECTIVES TO GENERAL OBJECTIVES 


Teachers need to understand the relationship between what takes 
place in the classroom and the general purposes of education.

Purposes of education are not static and unchangeable, for, as
societies change, the purposes of the schools change. The question of
whether the schools change society or society changes the schools is 
very much like deciding the question of which came first, the chicken 
or the egg. A public school system could not endure if it did not 
satisfy the needs of a vast majority of the population, and there is 
ample evidence to substantiate the belief that our schools have influ- 
enced our society. The classroom teacher must visualize as best he 
can how the work done in the classroom contributes to the enlighten- 
ment of individuals and how such enlightenment affects the over-all 
functioning of society. The specific objectives shown in Chart A are 
designed to assist the student in a science class to achieve partially 
the all-school objectives, because the English, mathematics, home 
economics, social studies, shop and physical education classes all 
contribute to this process. The all-school objectives are, in turn, 
designed to assist the student to achieve the lofty goals described 
in the chart. 

The relationships between the work of the teacher and the over-all 
purposes of education, as shown in Chart A, should be direct and 
understandable. In this way there would exist general agreement as 
to the over-all purposes of education in our American society; the 
all-school objectives would be directly implied from the over-all 
purposes, and the specific classroom objectives would in turn be 
directly implied from the all-school objectives. It would follow then 
that the work done in individual classrooms would contribute di-
rectly to the all-school objectives, and the attainment of the all-
school objectives would enable the individual to reach the over-all
purposes of education. That this ideal is not reached in actual prac-
tice can be attributed to a number of factors:

1. While we agree in general as to the over-all purposes of educa-
tion, many teachers have not given enough consideration to this
problem.

2. Many school systems have not attempted to describe the all-school
objectives they are seeking to attain in concrete terms.

3. General agreement does not exist as to how the over-all purposes
of education should be translated into all-school objectives.

4. Many teachers have not seen the relationship between what they
are doing in their classrooms and the stated or unstated all-school
objectives.

As greater understanding of the relationship between the work of
the teacher and the ultimate goals of education develops, these prob-
lems will fall into proper perspective and the results will prove bene-
ficial to the teacher seeking to bring purpose to his efforts.


Chart A. Relationship of Over-all Purpose of Education
to All-School Objectives
to Specific Classroom Objectives


OVER-ALL PURPOSES OF EDUCATION

In an ideal sense, education should produce the well-rounded man.
It should enlarge the ability to think and the capacity for thought. It
should be helpful in creating constructive attitudes—both on an in-
dividual and on a group level. It should impart basic and essential
general knowledge for balanced living, and basic and essential
knowledge for specific careers. It should develop ethical values. It
should furnish the individual with the necessary intellectual, moral,
and technical clothing for a presentable appearance in the world
community.


ALL-SCHOOL OBJECTIVES

In working toward attainment of all-school objectives the student—

1. Assumes responsibility for personal growth: (1) by making effec-
tive use of intellectual ability, (2) by making an effort to de-
velop character as exemplified in such qualities as personal
integrity, dependability, perseverance, courtesy, self-control, and
self-reliance, (3) by accepting constructive criticism.

2. Exercises appropriate emotional control.

3. Assists in orderly and effective functioning of school groups:
(1) by abiding by school rules and customs, (2) by accepting
group decisions, (3) by showing sensitivity to the needs of indi-
viduals and groups, (4) by performing official duties effectively,
(5) by respecting property, (6) by serving voluntarily as a leader
and/or follower.

4. Practices desirable habits of health and safety.

5. Demonstrates appropriate self-direction and persistence: (1) by
recognizing points within an area on which he needs improve-
ment, (2) by working toward improvement according to ability,
experience, and available resources.

6. Uses time wisely: (1) by planning effective use of available time,
(2) by bringing required materials, (3) by assembling equipment
before starting a project, (4) by starting work promptly, (5)
by meeting deadlines set by himself or others, (6) by cleaning
up and putting equipment away.

7. Shows ability to listen effectively.

8. Shows ability to read effectively.

9. Shows ability to speak effectively: (1) by providing adequate and
accurate content, (2) by observing correct pronunciation, enun-
ciation, speed, and tone, (3) by organizing thought.

10. Shows ability to write effectively: (1) by providing adequate and
accurate content, (2) by organizing thought, (3) by observing
conventions in spelling, punctuation, usage, and handwriting.


SPECIFIC CLASSROOM OBJECTIVES
HEALTH AND NUTRITION UNIT

OBJECTIVES

A. Knowledge

1. Knowledge of how the body operates

2. Knowledge of body mechanics and posture

3. Knowledge of the effect of disease upon the human body; rickets,
rheumatism, fever, etc.

4. Knowledge of the effect of alcohol and drugs upon the human
body

5. Knowledge of everyday hygiene; attractive skin, hair, and teeth

B. Skills

1. Skill in proper care of the body; cleanliness, proper diet, sufficient
sleep, sufficient rest, knowing the limits of work

2. Skill in maintaining body efficiency; training the body to per-
form to a maximum degree of efficiency with as little effort as pos-
sible, through proper conditioning

3. Skill in first aid and prevention of accidents

4. Skill in selection of proper food nutrients

5. Skill in selecting athletic equipment to suit the individual’s physi-
cal structure

6. Skill in purchasing equipment or material for the comfort of the
body, like insoles, rubdown liniment, etc.

C. Interests

1. Interest in healthful living through a sound understanding of
bodily needs and functions or the reward of having a healthy
body

2. Interest in new advancement against major diseases (drugs, de-
vices), cures of cancer, TB, leprosy, and polio

3. Interest in the people who produce necessary food products

4. Interest in first aid and the prevention of accidents

5. Interest in articles or books containing new information on health
and new discoveries in the medical world

D. Appreciations

1. Appreciation of the human body and its capabilities for everyday
living

2. Appreciation of new scientific improvements which safeguard our
health

3. Appreciation of food production and our dependence upon those
involved in food output

4. Appreciation of how a balanced life can lead to greater enjoyment

5. Appreciation of people who work hard to find new means or meth-
ods of combating disease

E. Attitudes

1. Attitude that personal hygiene knowledge is essential to a happy
and well-adjusted life

2. Attitude that eating proper food is essential to good health

3. Attitude that it is necessary to consult a physician in case of dis-
ease or injury

4. Attitude that communities can be improved through health cam-
paigns

F. Critical Thinking

1. How to sift the good from the bad in medical advertisements

2. Wise choices in daily living (food, sleep, activity)

3. Wise occupational decision; consideration of physical capabilities

4. Integration of hygiene principles with the process of living


Teachers need to understand the continuous nature of education. 

When the five-year-old enters kindergarten, he has a large speak-
ing vocabulary; he can run, jump, climb, draw; he has acquired the
ability to dress himself; he is developing a cultural pattern of right 
and wrong; and he is rapidly becoming ready to read, write, and 
engage in other complex mental and motor activities. As the young- 
ster grows chronologically, he also grows mentally, physically, so- 
cially, and emotionally. His limited range of knowledge at five is ex- 
panded by his formalized school activities and the activities of his 
family, friends, and community. By the time he completes his high 
school education he is fortified with a body of knowledge that ranges
from algebra to zoology. Knowledge is not the only aspect of behavior 
that grows, for the individual from kindergarten through high school 
graduation develops skills, attitudes, interests, and appreciations, 
and learns to think critically in many different areas. It is necessary 
to emphasize at this point that all individuals do not grow at the 
same rate nor do all individuals grow to the same level. Just as some 
individuals grow to be 6’ 3” and some stop at 5’ 4”, some students
gain a great deal of knowledge and ability to think critically while 
others gain very little in these areas. 

As shown in Figure 2, the behaviors that have been classified as 
knowledge and understanding, skills, attitudes, interests, apprecia- 
tions, and critical thinking form a continuum that reaches from the 
kindergarten through the twelfth grade. This continuum actually be- 
gins at birth and extends to the point at which each individual stops 
growing. The behaviors shown in the figure are developed in the
schools through many different curriculum patterns, but they all con-
tain to some extent the areas of (1) physical development, health, 
body care, (2) individual social and emotional development, (3) ethi- 
cal behavior, standards, values, (4) social relations, (5) the social 
world, (6) the physical world, (7) esthetic development, (8) com- 
munication, and (9) quantitative relationships. At the secondary 
school level these areas are generally identified by course designations, 
such as physical education, social studies, language arts, core, mathematics,
art, music, Latin, home economics, and machine shop.

Fig. 2. Continuum of behaviors
[Figure labels: QUANTITATIVE RELATIONSHIPS; Kindergarten; Twelfth Grade]

Each teacher at the high school level, seeking to assist students to progress,
must take into account the maturation and readiness of the learners. 
For example, the teacher of algebra cannot successfully teach stu- 
dents how to solve quadratic equations unless the students have a 
grasp of the mechanics of solving simple equations. Each teacher 
works with students to help them grow, and the cumulative efforts 
of teachers from kindergarten through the twelfth grade are responsible
for the success with which individuals attain the over-all objectives
of education.


Teachers must understand that a single learning experience may 
influence the development of many behaviors. 
Another reason why objectives of education are often considered
by teachers to be glittering generalities is that they fail to see that
they cannot develop the skill of diagraming sentences without affecting
the student’s interest in, attitude toward, and appreciation
of, writing. Many students have learned factual information about
the history of the United States while developing an intense dislike
for the social science field. On the other hand, students have acquired
vocational and avocational interests as a direct result of the
way a particular teacher has taught a science class. In teaching, it is
not possible to teach knowledge alone, nor is it possible to develop
positive attitudes without regard for knowledge.


Specific Teaching Objective: To acquire the ability to use the comma
properly.

POSSIBLE LEARNING EXPERIENCES

1. Sentences are placed on the board without comma punctuation
and the students discuss the meanings of each sentence.
2. Errors in comma punctuation are taken from student themes
and the students complete a correction exercise.
3. Students analyze current periodical articles and draw up a
list of how commas are used by authors.
4. The film, “——,” is shown and discussed.
5. Exercises containing correct and incorrect usage are completed
by the students.
6. The students read from a variety of grammar textbooks and
compile a list of generally accepted current suggestions on
the use of the comma.
7. Special drill exercises are given as needed.

POTENTIAL RESULTANT BEHAVIORS

1. The students acquire an understanding of why comma punctuation
is important.
2. The students develop skill in using commas.
3. The students develop an interest in using correct comma form.
4. The students acquire the ability to appraise written materials.
5. The students develop an appreciation of the skills needed to
write effectively.
6. The students develop the proper attitude toward correctness
in writing.
7. The students develop their skill in locating information about
a problem through the use of many reference sources.

Fig. 3. Relationship of a specific objective to learning experiences
and resultant behaviors


In the example shown in Figure 3, the teacher of English has set 
for herself the specific objective of having her students acquire the 
ability to use the comma properly. As a result of this decision a 
series of learning situations is introduced to assist the students to 
attain the objective. In the course of the evaluation process the 
teacher must recognize that the students are not only making prog- 
ress toward a specific objective, i.e., the ability to use the comma 
properly, but that skills, interests, appreciations, and thinking are 
also being influenced. It is in this way that all of the things done by 
the teacher tend to influence all aspects of student behavior. Be- 
cause of this factor, no teacher can teach for knowledge alone and 
disregard the more intangible aspects of the educational process. 


Teachers must understand the values to be gained in carefully de- 
fining classroom objectives. 

The objectives presented earlier are typical of a large number of 
such statements that have been derived to assist schools to focus 
their attention upon the ultimate goals of education rather than upon 
immediate day-to-day operations. It is essential that these ultimate
goals be understood by each teacher, but perhaps of more importance 
that each teacher know how to translate the generalizations into 
specific teaching objectives. It is important that objectives to be 
achieved by classroom teachers be identified in specific terms. These 
classroom objectives should— 


1. Become a part of the total educative process.
2. Serve as a basis for directing learning experiences.
3. Serve as the basis for evaluating student progress.
4. Serve as a basis for appraising teaching-learning situations.
5. Serve as a basis for reporting to parents.
6. Serve as a basis for the guidance process.
7. Serve as a basis for promoting the teacher’s sense of security.
8. Be direct, while generalized objectives are indirect.


1. The classroom objectives become a part of the total educative 
process. One of the basic problems facing secondary schools today is
segmentation of effort. Many teachers of English teach the skills of 
writing divorced from a report to a science or social studies class, 
while teachers of geometry teach logic divorced from science or 
problems-in-democracy classes, and teachers of music teach harmony 
divorced from the art and home mechanics classes. If integration or 
totality of purpose is to take place in the schools, it will, in part, 
come from a mutual understanding of teaching objectives. In this
way the objectives of the English teacher are known to the science 
teacher and in turn the objectives of both of these teachers are re- 
lated to the over-all objective of the school. Continuity of the educa- 
tional process is difficult to achieve unless the objectives for the var- 
ious parts of the continuum are known and coordinated. 

2. The classroom objectives serve as a basis for directing the learning
experience. In the history classroom the identification of an objective,
such as “to promote an increasing interest in the study of history,”
might very well direct the teacher away from assigning a project
involving memorizing the names of the American presidents
to viewing a movie covering the life of an American president,
or to listening to tape recordings of the radio program, “Mr. Presi- 
dent,” or to a television report on the inauguration, or to reading a 
biography of one of the presidents. The objectives must dictate the 
kind of learning situations that are to take place; the learning situa- 
tions should not dictate the objectives to be achieved. 

3. The classroom objectives serve as the basis for evaluating stu- 
dent progress. In the classroom, the teacher has as an objective, 
“development of an understanding of important facts and principles.” 
Techniques will have to be devised that will enable him to learn 
whether or not the student has achieved an understanding of the “im- 
portant facts and principles” of a particular unit. It might be pos- 
sible to measure the student’s understanding at the beginning of a 
unit and again at the end of the unit, with the gain that had taken 
place being some indication of the progress made by the student.
Many objectives of education cannot be measured in terms of precise
scores, but in all areas of education teachers should study a student’s 
progress in relation to specific objectives. 
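
The arithmetic behind such a before-and-after comparison can be sketched briefly. The following is a minimal illustration in Python, not a recommended scoring procedure: the student names, the scores, and the function name `gain_scores` are all hypothetical, and both tests are assumed to be scored on the same scale. As noted above, the gain is only some indication of progress, not a precise measure of it.

```python
def gain_scores(pre, post):
    """Return each student's gain: end-of-unit score minus beginning-of-unit score.

    Only students who took both tests are included.
    """
    return {name: post[name] - pre[name] for name in pre if name in post}


# Hypothetical scores on the same unit test, given before and after the unit.
pre_test = {"Student A": 18, "Student B": 31, "Student C": 25}
post_test = {"Student A": 34, "Student B": 40, "Student C": 27}

gains = gain_scores(pre_test, post_test)
for name, gain in gains.items():
    print(f"{name}: gained {gain} points")

# Average gain for the class, one rough summary of progress during the unit.
mean_gain = sum(gains.values()) / len(gains)
print(f"Mean gain: {mean_gain:.1f} points")
```

A teacher would still interpret each gain against the specific objective being sought, since a small gain from a high starting score and a large gain from a low one do not mean the same thing.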

4. The classroom objectives serve as a basis for appraising the 
teaching-learning situation. Teachers are confronted with a variety 
of claims and counterclaims about the use of motion pictures, the 
role of educational television, the place of field trips, the importance 
of role-playing, the use of tape recorders, and even the place of the 
textbook in the classroom. There is little question that the many 
methods available to the alert teacher have real value at the ap- 
propriate time and place. What constitutes the appropriate time and 
place for using them is determined in a large measure by the ob- 
jectives that the teacher is seeking to achieve. 

5. The classroom objectives serve as a basis for reporting to par- 
ents. One of the responsibilities that the teacher cannot delegate to 
others involves the making of judgments about other individuals. 
Since intuitive judgments are not completely reliable, it becomes 
necessary to make appraisals in the light of the best evidence at our 
disposal. This demands the collection of as much evidence as possible 
about student attainment of objectives. For this reason, it becomes 
extremely important for classroom teachers to identify specific ob- 
jectives, and to judge and report progress in terms of these goals. 
The concern in recent years over the inadequacies of many of the re- 
port cards in use has led to experimentation with a variety of differ- 
ent systems. Out of this has come the conviction that a sound re- 
porting system must be built upon the objectives of the school as 
they are defined in individual classrooms. 

6. The classroom objectives serve as a basis for the guidance 
process. In the literature of education during the second quarter of 
the twentieth century the field of guidance became prominent. It 
might be expected that in the second half of the century the vast 
amount of literature that has been written about the importance, 
purpose, place, and techniques of guidance will find application in 
most of the school systems of the United States. The educator’s con- 
cern for guidance is certainly one of the milestones in the educational 
history of the twentieth century. For the individual teacher, the role 
of guidance is associated with the objectives of the school as they are 
understood by the individual teacher and as they are to be achieved
in the classroom. If the objectives of the school are to be achieved,
then the role of guidance is to assist the individual student to 
achieve those goals. 

7. The classroom objectives serve as a basis for promoting the 
teacher’s sense of security. The factory worker can measure success 
by the number of airplane parts he produces, the business executive 
can measure success by computing net profits, the lawyer can measure 
success in terms of cases won, the physician can measure success by 
the number of patients cured. But the teacher must often wait years 
after the student has left the classroom to determine whether or not 
the teaching efforts were successful. Some measure of satisfaction can 
come to the teacher, however, if the immediate goals are achieved. 
Psychological security is in part developed by achieving a measure 
of success in what one does, and this success must to some extent 
be evident. 

8. The classroom objectives are direct, while over-all objectives 
are indirect. As has been explained previously, it is necessary to 
know what the ultimate goals of education are, but only when these 
goals have been stated in terms relating directly to the students in a 
single classroom unit can they have any meaning for teachers, pupils, 
administrators, or parents. The abstractness of general statements 
of objectives needs to be translated into the specificity of teaching 
objectives if an effective teaching-learning situation is to be achieved. 


CHAPTER 4


Determination of Classroom Objectives 


THE DESIRED RESULT of the educative process in the secondary 
schools is the development of individuals capable of self-motivation 
who are able to assume active, intelligent, adult roles in society. To 
this end the secondary school assists the individual to mature satisfactorily
as an adolescent by helping him develop an appropriate
set of human values, refine his vocational and avocational skills, 
establish a cultural frame of reference, and develop the skills of 
human relationships. The extent to which these objectives materialize 
will vary from individual to individual, and each teacher has a re- 
sponsibility for fostering the optimum development of the students 
in each and every classroom. 
There exist for the teacher-evaluator three definite problems:


1. What specific behaviors should students be expected to attain?
2. What is the best subject-matter content to use to foster the development
of desirable behaviors and how can this content be best organized?
3. What progress is the student making in light of the content being
used and the behaviors being sought?


DESCRIBING STUDENT BEHAVIORS 


To develop the ability to think critically, one must think critically 
about a specific topic, person, idea, or issue. Generalized concepts, 
such as critical thinking, interests, appreciations, skills, and attitudes,
are best understood when described in specific terms. An example
of how general objectives can be made more specific follows.


Senior Year—Problems of Democracy Course 
The over-all objectives for this class are: 


1. The development of an expanding range of skill in studying con- 
temporary problems through reading, writing, speaking, and listen- 
ing. 

2. The acquisition of knowledge about contemporary social, economic,
and political problems.

3. The development of realistic and critical thinking in regard to 
contemporary problems. 

4. The development of an appreciation of contemporary society. 

5. The development of successful group relationships through the use
of democratic values and procedures in studying contemporary
problems.

6. The development of healthy social attitudes toward the democratic 
heritage and man’s struggle for human rights. 

7. The development of interests in the problems of contemporary 
society. 


Skills. The students at the Broadview School are equipped with 
many communication skills by the time they reach the senior level. 
At this stage, however, these skills are in need of refinement. Types 
of skills that are implied in the above statement of general ob- 
jectives are as follows: (1) using outlines, taking notes, organizing 
and summarizing materials; (2) spelling, punctuating, and using 
grammar correctly in writing; (3) reading for main points and skim- 
ming for general ideas; (4) participating in informal discussions as 
well as the more formal oral reports, panels, and round tables; and
(5) developing techniques of listening that will promote discussions 
and the solution of problems. 

Knowledge. Acquiring knowledge involves more than memorization 
of the date on which the American Federation of Labor was founded. 
It is concerned with knowing facts, terms, trends, classifications, 
criteria, methods, principles, generalizations, or theories. The acqui- 
sition of knowledge is not considered a “storehouse” function but an 
ability to recognize, to recall, and to use information. 




Critical thinking. The development of realistic and critical think- 
ing is a process which necessitates the use of concrete experiences to 
minimize the frequent use of “glittering generalities” by high school 
students. Critical thinking involves (1) the ability to understand 
and interpret an idea, a work, or a passage; (2) the ability to use 
principles, generalizations, ideas, or methods in new problem situa- 
tions; (3) the ability to determine the relationship between the parts 
of a work or passage; (4) the ability to organize ideas; and (5) the
ability to appraise and judge a policy, a work, or a specific view- 
point in terms of criteria. 

Appreciation. Appreciation is a creative experience engaged in by 
the learner. It may be inferred that appreciation is a gratifying emotional
response growing out of a satisfying experiencing of the
beautiful in life. It is often thought to be a sense of value. It 
should not be considered an absolute quality, but a relative one. 
To appreciate is to use all the senses to gain satisfaction from 
experiences. 

Group relationships. Successful group relationships imply the 
ability to get along with others. This behavioral concept connotes 
willingness to listen to the opinions of others, willingness to discuss 
issues, not personalities, willingness to modify extreme views for the 
sake of a goal, willingness to assist others, willingness to sacrifice 
personal glory for group gain, and recognition of the role of human 
differences. This behavior must be an active and not a passive accept- 
ance of either peer standards or adult standards. 

Attitudes. The development of healthy social attitudes suggests 
that the students will have to think of subject-matter content in 
terms of its value to mankind. Learning experiences must be designed 
to help students see the social implications of the content areas. Since 
the content area is in the realm of contemporary social problems, it 
should be expected that attitudes will be evident in the way the 
individual expresses himself and puts his impressions into action. It 
must be recognized that development of attitudes may not result in 
immediate overt behavior, but social sensitivity to problems should 
evolve. 

Interest. Interest development implies that the student through
his own initiative will seek information from newspapers, books, 
pamphlets, and television to increase his background and understanding
of the content. It means a willingness to discuss issues of contemporary
life freely and intelligently.

The interpretation of these seven over-all objectives enables the 
teacher and the student to understand in concrete terms the be- 
haviors that are expected to emerge in the classroom. The description 
above is not the only analysis of the behaviors that may be made. 
There have been many attempts to describe in exact terms the be- 
haviors which students should attain. 

One group of social studies teachers working in the area of general 
education sought to identify the characteristics of the behavior 
identified as “critical thinking.” Their analysis of the general ob- 
jective resulted in the following specific behavioral outcomes: 


1. To identify central issues. One of the basic skills in critical think- 
ing is the ability to identify the central issue or main theme. The 
thesis may be perfectly clear; it may be hidden in a mass of ver- 
biage; or it may be unstated. Until the student has identified the 
central issue, an analysis of the information cannot proceed on a 
sound basis.

2. To recognize underlying assumptions. An argument is always 
based upon certain assumptions. These assumptions may be gen- 
erally accepted; they may be subject to grave doubt; or they may 
be absolutely untenable. The validity of many arguments depends 
upon the validity of the assumptions upon which they are based. 
An individual whose analysis does not go beyond the argument 
and into the assumptions will seldom arrive at a truly satisfactory 
insight into any social science issue. 

3. To evaluate evidence or authority. 

a. To recognize stereotypes and cliches. Social science materials 
contain abundant illustrations of faulty thinking in the form of 
stereotypes and cliches. Everyone is familiar with the popular 
concepts of “the American clubwoman,” “the tired business- 
man,” “the absent-minded professor,” “100 percent American- 
ism,” and “the good old days.” Many people who accept these 
at face value may be victimized by skillful propaganda tech- 
niques. 

b. To recognize bias and emotional factors in a presentation. The 
validity of any presentation should depend only upon such 
factors as the soundness of its reasoning and its factual basis. 
Many presentations, however, neglect reason and fact and substitute
highly colored words or appeals to prejudice. This practice
is frequently an admission that there is very little substance
supporting the presentation. Since bias refers to opinions
or attitudes based on prejudice and preconception rather than 
upon fact and reason, it bears no constant relation to truth 
and is as likely to be favorable as it is to be unfavorable. To 
detect bias is not to impute dishonesty, for many biases are un- 
conscious. Recognizing bias, conscious or not, is the important 
thing. Awareness of the part one’s own biases may play in the 
process of analysis and decision is also an important factor in 
critical thinking. 


c. To distinguish between verifiable and unverifiable data. An
early step in determining the verifiability of a proposition is the
distinction between material which is of a factual or verifiable
nature and that which is not. Sweeping generalizations, value
judgments, beliefs, and opinions are usually unverifiable. Material
of a factual nature, on the other hand, is capable of proof
or disproof, although frequently the data necessary to verify
it may not be available.


d. To distinguish between relevant and nonrelevant. To analyze
social situations and problems adequately, an individual must
be able to distinguish between those facts that have a bearing
upon the solution and those that do not. One should ask, “Does
this statement define, illustrate, or bear upon the problem?”
This ability is less complex than the one which follows, because
it does not require the individual to judge the degree of relevancy,
but only to sort the aspects of a situation into those
which do or do not have a bearing upon it.


e. To distinguish between essential and incidental. Those facts
which are essential to a given situation are often confused with
other facts which are present but are not a necessary part of
that situation. Relevant data are not necessarily essential to an
interpretation and may be of only secondary importance.


f. To recognize the adequacy of data. An appreciation of the connection
between adequate facts and a valid conclusion is an
essential ability in critical thinking. A judgment made on the
basis of fragmentary evidence is likely to be of little value. In
dealing with social issues, it is particularly important that judgments
be based upon sufficient information. It is also important
to be able to detect that significant data have been ignored or
omitted. The omission may have been unintentional, but often
the additional evidence has been purposely suppressed in order
to strengthen the argument advanced. In many cases consideration
of neglected material will destroy an argument completely.

g. To determine whether facts support a generalization. Facts 
may be relevant, essential, and adequate but still not support 
a generalization. Furthermore, poorly selected facts occasion- 
ally contradict and seem to disprove a generalization. Also, in 
some cases the support furnished by one fact is stronger than 
that furnished by another. 

h. To check consistency. All arguments must be checked for inter- 
nal consistency. Identification of a major inconsistency may 
invalidate a presentation, and in any case an argument cannot 
be considered as a logical whole when it is based upon contra- 
dictory elements. If an argument withstands the test of internal 
consistency, it still must be submitted to a check for consistency 
with other known data. Having recognized the external con- 
sistency or inconsistency of the argument, one is ready to draw 
a conclusion. 

4. To draw warranted conclusions. The drawing of a warranted con- 
clusion involves making an inference. An inference is a truth or 
proposition drawn from another which is admitted or supposed to 
be true; a conclusion; a deduction. An individual needs to realize 
that certain facts not explicitly stated may be inferred as true or 
untrue. It is also important to realize the limitations of the infer- 
ences which can be made from given data. Many statements at 
first glance appear plausible, but if the inferences are properly 
drawn their meanings may change.¹

1 Paul L. Dressel and Lewis B. Mayhew, General Education: Explorations in
Evaluation (Washington, D.C.: American Council on Education, 1954), pp. 38-40.


One of the most significant attempts to describe educational ob-
jectives has come from a committee of college and university ex-
aminers who are attempting to develop a classification of educational
objectives. In a recent report the committee indicated the need for,
and the potential uses of, a taxonomy.


. . . It is intended to provide for classification of the goals of our edu-
cational system. It is expected to be of general help to all teachers,
administrators, professional specialists, and research workers who deal
with curricular and evaluation problems. It is especially intended to help
them discuss these problems with greater precision. For example, some
teachers believe their students should “really understand,” others desire
their students to “internalize knowledge,” still others want their students
to “grasp the core or essence” or “comprehend.” Do they all mean the 
same thing? Specifically, what does a student do who “really under- 
stands” which he does not do when he does not understand? Through 
reference to the taxonomy as a set of standard classifications, teachers 
should be able to define such nebulous terms as those given above. This 
should facilitate the exchange of information about their curricular de- 
velopments and evaluation devices. Such interchanges are frequently 
disappointing now because all too frequently what appears to be com- 
mon ground between schools disappears on closer examination of the 
descriptive terms being used.²


The cognitive domain, which is the concern of the first handbook
of this committee, includes those objectives which deal with the recall
or recognition of knowledge and the development of intellectual
abilities and skill. An outline of the principal elements of this
analysis follows. In the handbook each of these elements is described
in considerable detail and implications for teachers and test makers
are also considered.³

2 Taxonomy of Educational Objectives, ed. Benjamin S. Bloom (New York:
Longmans, Green and Co., 1956), p. 1.
3 Ibid., pp. 1-207.


KNOWLEDGE


1.00 KNOWLEDGE

Knowledge, as defined here, involves the recall of specifics and uni-
versals, the recall of methods and processes, or the recall of a pattern,
structure, or setting. For measurement purposes, the recall situation
involves little more than bringing to mind the appropriate material. Al-
though some alteration of the material may be required, this is a rela-
tively minor part of the task. The knowledge objectives emphasize most
the psychological processes of remembering. The process of relating is
also involved in that a knowledge test situation requires the organization
and reorganization of a problem such that it will furnish the appropriate
signals and cues for the information and knowledge the individual pos-
sesses. To use an analogy, if one thinks of the mind as a file, the problem
in a knowledge test situation is that of finding in the problem or task
the appropriate signals, cues, and clues which will most effectively bring
out whatever knowledge is filed or stored.


1.10 Knowledge of specifics
1.11 Knowledge of terminology
1.12 Knowledge of specific facts

1.20 Knowledge of ways and means of dealing with specifics 
1.21 Knowledge of conventions 

1.22 Knowledge of trends and sequences 

1.23 Knowledge of classifications and categories 

1.24 Knowledge of criteria 

1.25 Knowledge of methodology

1.30 Knowledge of the universals and abstractions in a field 
1.31 Knowledge of principles and generalizations 

1.32 Knowledge of theories and structure 


INTELLECTUAL ABILITIES AND SKILLS 


Abilities and skills refer to organized modes of operation and general- 
ized techniques for dealing with materials and problems. The materials 
and problems may be of such a nature that little or no specialized and 
technical information is required. Such information as is required can 
be assumed to be part of the individual’s general fund of knowledge. 
Other problems may require specialized and technical information at a 
rather high level such that specific knowledge and skill in dealing with 
the problem and the materials are required. The abilities and skills 
objectives emphasize the mental process of organizing and reorganizing 
material to achieve a particular purpose. The materials may be given 
or remembered. 


2.00 COMPREHENSION 
This represents the lowest level of understanding. It refers to a type
of understanding or apprehension such that the individual knows what
is being communicated and can make use of the material or idea being 
communicated without necessarily relating it to other material or seeing 
its fullest implications. 


2.10 Translation 
2.20 Interpretation 
2.30 Extrapolation 


3.00 APPLICATION 

The use of abstractions in particular and concrete situations. The ab- 
stractions may be in the form of general ideas, rules of procedures, or 
generalized methods. The abstractions may also be technical principles, 
ideas, and theories which must be remembered and applied. 


4.00 ANALYSIS 
The breakdown of a communication into its constituent elements or 
parts such that the relative hierarchy of ideas is made clear and/or the
relations between the ideas expressed are made explicit. Such analyses
are intended to clarify the communication, to indicate how the com- 
munication is organized and the way in which it manages to convey its 
effects, as well as its basis and arrangement. 

4.10 Analysis of elements 


4.20 Analysis of relationships 
4.30 Analysis of organizational principles 


5.00 SYNTHESIS 
The putting together of elements and parts so as to form a whole. This
involves the process of working with pieces, parts, elements, and so on,
and arranging and combining them in such a way as to constitute a 
pattern or structure not clearly there before. 

5.10 Production of a unique communication 

5.20 Production of a plan or proposed set of operations 

5.30 Derivation of a set of abstract relations 


6.00 EVALUATION 
Judgments about the value of material and methods for given pur- 
poses. Quantitative and qualitative judgments about the extent to which 
material and methods satisfy criteria. Use of a standard of appraisal. The
criteria may be those determined by the student or those which are given 
to him. 
6.10 Judgments in terms of internal evidence 
6.20 Judgments in terms of external criteria

The teacher or teachers confronted with the task of identifying the
behaviors inherent in the objectives they are trying to achieve should
do so in a language which is understandable to them and to their
students. The teachers must ask themselves: “What does a student
do who can think critically?” “What does it mean for a student to
‘understand’?” “What is the difference in behavior between the
student who is interested and one who isn’t?” “How does a student
show that he is ‘appreciating’?” “What are the overt expressions of
attitudes?” and “What characterizes a student who has skill?”


DETERMINING THE CONTENT TO BE USED 
IN THE CLASSROOM 


If that which is taught or learned in a classroom is of small conse- 
quence, it might be reasonable to expect each teacher in a school to 
construct his own curriculum without concern for what has been
taught or what might be taught. Greek mythology could become the
subject matter in one classroom, the art of deep-sea diving might 
occupy the attention of another, and still another might develop units 
based upon contract bridge. Obviously, it does matter what is taught 
and learned in a school, and a completely free and emerging cur- 
riculum is a fictionalized version of what actually does take place in 
the schools. The major advocates of flexible program planning also 
recognize the necessity of establishing a structure in which flexibility 
is possible. 

While the content to be used in the classroom may be for the most 
part structured by state requirements, courses of study, textbooks, 
and physical resources, there exist for the creative teacher many 
opportunities to select and organize a body of content. For example, 
what can be done to select and organize the content of a course in 
American history?

Pattern A. The American history teacher, after a study of the text- 
books available, can decide to use a chronological approach in which 
all aspects of historical development are held together by a time 
sequence. The teacher might propose to develop the following areas:


I. European Backgrounds of American History
II. Exploration and Colonization
III. Struggle for a New Continent
IV. Conflict and Revolution
V. A New Nation Is Born
VI. Expanding Westward—Peace and Conflict 
VII. Civil War and Reconstruction 
VIII. Social, Political, and Economic Challenges 
IX. World War I and International Relationships 
X. Between the Wars 
XI. World Power and the Future 


Pattern B. A second approach can also be based upon the chron- 
ological evolution of events, but a different organizational pattern 
emphasizes certain unique elements in the growth of the nation. 


I. Geographic History of the United States 
II. Social and Cultural History 
III. Political Developments 
IV. Economic History of the United States 
V. United States and Foreign Relations 




Pattern C. Another approach may arise if the group selects a num- 
ber of current problems in American life and traces their evolutionary 
pattern. Such a class might come up with the following areas: 


I. How Did the Concept of the Free Man Develop in the United 
States? 
II. How Did Political Parties Grow in the United States?
III. What Has Been the Involvement of the United States in Inter- 
national Affairs? 
IV. How Did Big Business Develop? How Did Big Labor Unions 
Develop? 
V. What Has Been the Place of the Arts in American Life? 
VI. How and Why Has the Population of the United States Grown? 


Pattern D. Still another approach can be provided if the group 
decides to study the life and times of several great American presi- 
dents. Such an organization of American history might result in the 
following outline of areas: 

I. The Life and Times of Washington 
II. The Life and Times of Jackson 
III. The Life and Times of Lincoln 
IV. The Life and Times of Theodore Roosevelt 
V. The Life and Times of Wilson 
VI. The Life and Times of Franklin D. Roosevelt 
VII. The Life and Times of Eisenhower 


Pattern E. Another content pattern emerges if the teacher attempts 
to use the experiences of the students as a base upon which the pat- 
tern is developed. In such a pattern the areas to be investigated 
might vary from community to community. 

I. What Significant Events in American History Took Place in 
Our Community? 
II. How Did Political Events in American History Influence Our
Community?
III. What Is the Significance of the Bill of Rights for Our Com- 
munity? 
IV. How Has the History of the United States in International 
Affairs Influenced Our Community? 
V. How Have the People Changed in the United States Since Its 
Early Days? 
VI. What Is the Significance of American History and the Future 
of Our Community? 




Each of the content patterns presented above, and many others, 
might logically be developed for a class studying American history. 
Moreover, it is possible to assume that the teachers using the various 
patterns have the same objectives in mind when the classes are or- 
ganized. The reason for using a particular content pattern must come 
from answers to questions such as these:


1. What content pattern will best promote the development of the 
desired behaviors? 

2. What content pattern can be used most efficiently? 

3. Which content pattern can best be used to capitalize upon stu- 
dents’ interests and levels of ability? 

4. For which content pattern are the greatest number of human and
physical resources available?

5. What aspects of the content are to be emphasized? 


The importance of the organization of the content is to be found 
in the climate of acceptance developed by the students in the class- 
room. If the content can be organized to demonstrate its value to the 
students, the probability of gaining optimum growth is enhanced. 
The organization of the content, as well as the analysis of the
behaviors to be achieved, might very well be a function of
teacher-student activity. Active participation by students in this process will
encourage the formation of a climate of acceptance. 

Ideas for developing content patterns may best come through the 
creativity of a teacher or the originality of students; however, stimu- 
lus for this creativity may come from state department of education 
publications, curriculum guides prepared by individual school sys- 
tems, resource units prepared by such groups as the National Council 
for the Social Studies or the National Council of Teachers of Eng- 
lish, descriptions of patterns in professional periodicals and profes- 
sional textbooks, and organization of content in modern textbooks. 


DEVELOPING A WORKABLE PATTERN OF 
CLASSROOM OBJECTIVES 


Effective teaching can be considered as a series of planned ex- 
periences that are designed to enable students to reach self-de- 
termined and/or predetermined goals. It has been indicated that the 
goals in education need to be analyzed in terms of behaviors that
individuals are expected to attain and in terms of the content by
which the behaviors are to be developed. Such an analysis should 
assist teachers in thinking about the role they are to play in the 
classroom. To make this analysis functional, that is, to enable the 
teacher to use this analysis in classroom situations, it is helpful to 
devise a system whereby the aspects of behavior and content are 
synthesized. 

Ralph Tyler has said that “the most useful form for stating objectives
is to express them in terms which identify both the kind of
behavior to be developed in the student and the content or area of
life in which this behavior is to operate.”⁴ For example, here are
some “poor” and “better” illustrations of statements of objectives:


Poor: To acquire broad interests 
Better: To acquire broad interests in reading short stories 


Poor: The struggle for power between Johnson and Congress 
Better: To determine the main issues in the struggle for power be- 
tween Johnson and Congress 


Poor: To develop the ability to think critically 
Better: To develop the ability to compare different points of view 
about modern art 
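
Tyler's form can be read as a pair: a behavior, plus the content in which that behavior is to operate. The short Python sketch below only illustrates that reading. The `Objective` class, its field names, and the completeness check are assumptions introduced here for illustration and are not part of Tyler's statement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Objective:
    """A classroom objective in two parts (hypothetical helper for illustration)."""
    behavior: str  # what the student is to do
    content: str   # the content or area of life in which the behavior operates

    def is_complete(self) -> bool:
        # In this reading, an objective needs both a behavior and a content part.
        return bool(self.behavior.strip()) and bool(self.content.strip())

    def statement(self) -> str:
        if not self.is_complete():
            raise ValueError("objective needs both a behavior and a content area")
        return f"To {self.behavior} {self.content}"


# The first "poor" illustration names a behavior but no content.
poor = Objective(behavior="acquire broad interests", content="")
better = Objective(behavior="acquire broad interests in",
                   content="reading short stories")

print(poor.is_complete())   # False
print(better.statement())   # To acquire broad interests in reading short stories
```

The second "poor" illustration fails in the opposite way: it names content (the struggle between Johnson and Congress) but no behavior, so the same check would reject it.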


It is possible to combine an analysis of behaviors and content into a 
single pattern by the use of a two-dimensional grid. 

For example, in a social dancing class the teacher and/or the
students decide that there are six main areas to be studied: Dancing
to Slow Music, Rhythms in 3/4 Time, Enjoying the Swing Steps,
Learning the Tango, Rhumba Rhythms, and the Mambo Craze. Not
only does the teacher believe that there are skills to be learned but
he also expects that students will acquire knowledge about each
of the areas, develop certain attitudes and interests, and actually
think critically about each of the areas. Once this over-all pattern
has been developed, the teacher and/or the students can spell out
in descriptive terms what the goals are to be in each of the areas.
How this is to be accomplished is shown in Chart B.

Regardless of the content area (core, algebra, homemaking, science,
physical education, social science, art, or music), it is possible
to develop a two-dimensional grid analysis of goals. Examples of
grids for art, core classes, and biology are shown in Charts C, D, and
E that follow. The goals become a blueprint for the teacher and the
students. The grid does not assure successful learning; it is simply
an organizational scheme for relating the behavior and content
aspects of educational objectives.

4 Tyler, op. cit., p. 26.

To facilitate the construction of a satisfactory grid there are several
steps that the teacher must take. He must:

1. Identify, with the assistance of the class if possible, the behaviors
that students in that class should be expected to attain.

2. Identify, with the assistance of the class if possible, the content in
which the behaviors are to be developed.

3. Construct the grid. The degree of specificity will depend upon the
individual teacher and the group. Each of the statements should
represent, as nearly as possible, a concrete description of the goals.

4. Study the grid carefully to determine if any areas or behaviors
have been omitted in the earlier analysis, and revise and modify
as necessary.

5. Evaluate the grid by raising and answering the following questions:

a. Are the behaviors to be achieved stated clearly in terms which
actually identify the actions of students?

b. Are the behaviors to be achieved within the realm of attainability
by the students?

c. Is the content identified clearly and precisely?

d. Is the relationship between the behavior to be attained and the
content in which the attainment is to take place understood
clearly?

e. Are limits to the number of attainable objectives defined
clearly?

f. Can learning experiences be developed which will assist the
student to attain the desired objectives?

g. Are the objectives so stated that eventual appraisal of student
behavior can be made?

h. Are the objectives sound socially, psychologically, and
educationally?

The teacher, having developed with the aid of students the direc- 
tional goals for the semester or the year, is now ready to plan, or- 
ganize, direct, and execute learning experiences, and to evaluate the 
total educative process. 
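
The grid-building steps described above can be sketched as a small data structure. The Python example below is only an illustration under stated assumptions: the function names `build_grid` and `empty_cells` are hypothetical, the four behavior headings and the sample goals are taken from Chart B, and only two of the six content areas are included to keep the sketch short. It shows how goals can be filed by content area and behavior, and how the cells still lacking a goal can be listed when the grid is studied for omissions.

```python
# Column headings (behaviors) and two of the row headings (content areas)
# from Chart B. Restricting to two content areas is a simplification.
BEHAVIORS = [
    "Knowledge and/or understanding",
    "Skills",
    "Attitudes, interests, and appreciations",
    "Critical thinking",
]
CONTENT_AREAS = ["Dancing to Slow Music", "Learning the Tango"]


def build_grid(goals):
    """File (content, behavior, goal) triples into grid[content][behavior]."""
    grid = {c: {b: [] for b in BEHAVIORS} for c in CONTENT_AREAS}
    for content, behavior, goal in goals:
        grid[content][behavior].append(goal)
    return grid


def empty_cells(grid):
    """List the (content, behavior) cells that have no goal statement yet."""
    return [(c, b) for c, row in grid.items() for b, cell in row.items() if not cell]


goals = [
    ("Dancing to Slow Music", "Knowledge and/or understanding",
     "Knows the characteristics of fox trot music"),
    ("Dancing to Slow Music", "Skills",
     "Is able to keep in step with the music"),
    ("Dancing to Slow Music", "Attitudes, interests, and appreciations",
     "Enjoys dancing to slow music"),
    ("Dancing to Slow Music", "Critical thinking",
     "Is able to recognize fox trot music"),
    ("Learning the Tango", "Knowledge and/or understanding",
     "Knows the names of the basic tango variations"),
    ("Learning the Tango", "Skills",
     "Is able to perform the Right Parallel"),
]

grid = build_grid(goals)
for content, behavior in empty_cells(grid):
    print(f"No goal yet: {content} / {behavior}")
```

A listing of empty cells says nothing about the quality of the goals that are present; the questions under the fifth step still have to be answered by the teacher and the students.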


Chart B. Teaching Social Dancing in Physical Education


CONTENT: DANCING TO SLOW MUSIC

  KNOWLEDGE AND/OR UNDERSTANDING
    Knows the steps in walking to fox trot music
    Has knowledge of the proper dance position
    Has knowledge of the procedures in executing the basic steps and several variations
    Knows the characteristics of fox trot music
    Knows the basic methods of leading and following

  SKILLS
    Is able to walk to music using good posture
    Can lead or follow partner with ease
    Has developed ability to do the basic step, forward and backward movements, right and left turns and the conversation step
    Is able to keep in step with the music
    Has ability to organize variations in patterns

  ATTITUDES, INTERESTS, AND APPRECIATIONS
    Understands the usefulness of the ability to dance
    Enjoys dancing to slow music
    Is stimulated to obtain self-confidence
    Appreciates the aesthetic value in music
    Is interested in improving skills
    Is stimulated to try new steps outside of class

  CRITICAL THINKING
    Is able to recognize fox trot music
    Can differentiate between proper and improper steps
    (Man) Knows what steps to do next
    (Woman) Is able to recognize the lead and act
    Has good manners when social dancing


CONTENT: RHYTHMS IN 3/4 TIME

  KNOWLEDGE AND/OR UNDERSTANDING
    Knows difference between fox trot and waltz
    Knows the name of the basic steps
    Is able to repeat knowledge of the waltz, its origin and development
    Knows the procedures of executing several variations

  SKILLS
    Has ability to master the box step
    Can perform the basic waltz variations, including the forward and backward movements and the left turn
    Is able to do the four balance steps
    Has developed ability to do the right and left Parallel
    Can perform the Underarm Turn

  ATTITUDES, INTERESTS, AND APPRECIATIONS
    Appreciates the beauty and grace of the waltz movements
    Understands the importance of waltzing in respect to other dance steps
    Has an interest in devising new variations

  CRITICAL THINKING
    Has ability to differentiate between fox trot and waltz music
    Can recognize when he is in time with the music
    Has ability to organize new waltz patterns


CONTENT: ENJOYING THE SWING STEPS

  KNOWLEDGE AND/OR UNDERSTANDING
    Knows the three swing steps
    Knows correct dance position and step motion
    Knows the toe-heel and back-step sequence
    Knows names of variations and methods of leading and following
    Knows basic principles of the Lindy Hop and the California Swing

  SKILLS
    Can master the basic swing step
    Develops ability to break away and return
    Is able to perform the right and left turns
    Has ability to use correct hand position
    Has mastered several variations including the turns, the break-aways, the skip step, the triple step, the walk step, etc.

  ATTITUDES, INTERESTS, AND APPRECIATIONS
    Develops an interest in Latin American rhythms
    Enjoys dancing to fast music
    Has a genuine interest in learning swing steps
    Understands the movements and likes to devise differences in the basic steps

  CRITICAL THINKING
    Can differentiate between the types of swing steps
    Has the ability to devise and use new steps and variations
    Can develop an original style


CONTENT: LEARNING THE TANGO

  KNOWLEDGE AND/OR UNDERSTANDING
    Knows correct posture and dance position
    Knows the procedures of stepping to the tango steps
    Has a general knowledge of the development of the tango
    Knows the names of the basic tango variations

  SKILLS
    Has developed the ability to master the basic tango step
    Is able to perform the Right Parallel
    Can do the right turn and the Conversation step
    Has developed the ability to use several steps including the Medio-corte, the Double Corte, La Puenta, the grapevine and cross over

  ATTITUDES, INTERESTS, AND APPRECIATIONS
    Enjoys learning the tango and its variations
    Develops an interest in Latin American music
    Appreciates the amount of skill needed to perform the tango correctly

  CRITICAL THINKING
    Has ability to recognize tango music
    Can apply basic steps to the music
    Has good judgment as to the type of step used in a certain situation


CONTENT: RHUMBA RHYTHMS

  KNOWLEDGE AND/OR UNDERSTANDING
    Knows how and where the rhumba originated
    Knows the similarities between the waltz steps and the rhumba steps
    Knows the names of the rhumba steps

  SKILLS
    Has developed ability to walk using the rhumba movement
    Is able to keep time to the music in performing the basic rhumba variations
    Has developed ability to master the basic rhumba steps, including the Rhumba Box, the forward and back step, break-away and turn

  ATTITUDES, INTERESTS, AND APPRECIATIONS
    Has an understanding of the importance of the rhumba movement
    Develops interest in different Latin American dances
    Is stimulated to master steps and devise new ones

  CRITICAL THINKING
    Is able to apply the basic waltz movements to the basic rhumba movements
    Can differentiate between the rhumba rhythms and other South American rhythms


CONTENT: THE MAMBO CRAZE

  KNOWLEDGE AND/OR UNDERSTANDING
    Knows the growth in popularity in the U.S.
    Knows the similarities between the rhumba and mambo rhythm, and step sequence
    Has learned the mambo step pattern
    Knows the names of the basic mambo steps and variations

  SKILLS
    Is able to use correct motion in performing mambo steps
    Knows how to do the basic mambo step
    Has developed the ability to perform the following variations: left and right side steps with kick, side cross-over step (with twist), kick-up step (with leg swing and left turn and return)

  ATTITUDES, INTERESTS, AND APPRECIATIONS
    Learns to appreciate the mambo and the mambo music
    Has an interest in learning more complex variations
    Appreciates the value of learning the Latin American dances
    Has an interest in showing others how to dance
    Enjoys dancing and likes to watch others

  CRITICAL THINKING
    Is able to differentiate between the rhumba and mambo music
    Can recognize the rhythms and apply the steps properly
    Has ability to lead partner and to follow the lead
    Has ability to pick up new steps while watching others


Chart C. Educational Objectives in Art


CONTENT: DRAWING

  KNOWLEDGE AND/OR UNDERSTANDING
    The ability to see line and contour
    The ability to see form
    The ability to see space and space relationships
    The ability to recognize texture
    The ability to visualize the relationships existing between forms
    The ability to recognize light and shade
    Knowledge of how form moves, round and square
    Knowledge and understanding of drawing media and technique
    Knowledge and understanding of perspective

  SKILLS
    The ability to observe
    The ability to be selective
    The ability to render convincingly
    The ability to use the different media and techniques

  ATTITUDES, INTERESTS, AND APPRECIATIONS
    The ability to make observations of everyday surroundings
    The ability to develop an interest in other works of art
    The desire to develop new techniques
    The ability to recognize good art
    The ability to recognize art as a means of communication
    The development of a respect for (all) art as an art
    The development and stimulation of original thinking

  CRITICAL THINKING
    The ability to abstract everyday surroundings
    The ability to apply knowledge and understanding to other works of art


CONTENT: SPACE ORGANIZATION

  KNOWLEDGE AND/OR UNDERSTANDING
    The ability to know and recognize design principles
    The ability to know and recognize design elements
    Knowledge of space organization
    Knowledge of various materials and techniques

  SKILLS
    The ability to organize space
    The ability to apply the design principles and elements to individual work
    The ability to use various materials in design
    The ability to develop techniques
    The ability to use color effectively

  ATTITUDES, INTERESTS, AND APPRECIATIONS
    The ability to observe design in everyday surroundings
    The desire to develop new techniques
    The ability to observe design in other works of art
    The ability to recognize good design
    A respect for design in everyday living

  CRITICAL THINKING
    The ability to see design in everyday living
    The ability to use good taste and discrimination


CONTENT: COLOR

  KNOWLEDGE AND/OR UNDERSTANDING
    Knowledge of physical qualities of light
    Knowledge of light's relationship in color
    Knowledge of primary colors
    The ability to distinguish between different hues
    Knowledge and understanding of how value works
    Knowledge and understanding of the color harmony principles

  SKILLS
    The ability to mix pigments
    The ability to use color effectively

  ATTITUDES, INTERESTS, AND APPRECIATIONS
    An interest in color relationships
    An interest in acquiring the desired hue by mixing pigments
    A desire to discover new combinations of pigment
    An appreciation of the color theories
    An appreciation of good taste in the color used in everyday life
    An appreciation of improved pigment

  CRITICAL THINKING
    The ability to use color effectively


Chart D. Educational Objectives in Core Classes


CONTENT: YOU, THE TEENAGER

  KNOWLEDGE AND/OR UNDERSTANDING
    Knowing the various kinds of maturity
    Understanding growth development in emotional, social, physical, mental, and philosophical maturity
    Understanding the important influences in your life, such as hereditary factors, environment, i.e., family, community, school, church, etc.
    Knowing all those basic needs which make for physical well-being
    Understanding physical growth pattern
    Understanding your needs as a growing person in health, personality, and family
    Understanding what is meant by a philosophy of life
    Knowing the various stages in social maturity

  SKILLS
    Having ability to interpret growth development in emotional, social, physical, mental, and philosophical maturity
    Having ability to locate materials on growth development
    Applying principles and ideas of growth development in different situations
    Having ability to interpret the important influences in your life
    Having ability to interpret your needs as a growing person
    Having ability to interpret how your needs are met for your physical well-being
    Having ability to list those things which determine a philosophy of life

  ATTITUDES, INTERESTS, AND APPRECIATIONS
    Being aware of immaturity and showing concern for maturing
    Showing concern for, and awareness of, growth development
    Being sensitive to how our needs are not met
    Showing concern for understanding terms and definitions in growth development
    Showing sensitivity to the important influences in your life
    Showing awareness of how personality needs are met
    Desiring to tend to the needs of the adolescent
    Showing concern for understanding the problems of the teenager

  CRITICAL THINKING
    Being able to evaluate the evidence or authority of growth development but at the same time to recognize the limitations of these data
    Being able to establish the relationship of environment to the important influences in your life
    Being able to identify those things which you do not inherit
    Being able to recognize the underlying assumptions of your needs as a growing person and to draw warranted conclusions
    Being able to identify the central issues in a philosophy of life
    Being able to identify your progress with your philosophy of life


CONTENT: LIVING IN YOUR FAMILY

  KNOWLEDGE AND/OR UNDERSTANDING
    Knowing and understanding what parents expect of you
    Understanding what to expect of your parents
    Understanding of the family as a unit
    Understanding how young people feel about their parents
    Understanding how parents feel about young people
    Knowing the privileges of family membership
    Understanding how to get along with your family members

  SKILLS
    Having ability to interpret the demands of your parents
    Having ability to get across to your parents what you expect of them
    Having ability to participate as a member in your family
    Being able to locate materials in family living
    Having ability to use your home facilities to their fullest extent

  ATTITUDES, INTERESTS, AND APPRECIATIONS
    Being aware of what your parents expect of you
    Showing concern for what is expected of your parents
    Desiring to become an active participant of your family unit and being aware of what is expected of you as a participant in your family unit
    Being sensitive to disagreements between parents and young people
    Being aware of your parents’ expectations, depending upon who they are (background, etc.)

  CRITICAL THINKING
    Being able to recognize the underlying assumptions of what your parents expect of you
    Being able to establish a relationship between what your parents expect of you and what you expect of your parents
    Being able to identify the central issues of being a member of a family


CONTENT: GETTING ALONG WITH PEOPLE

  KNOWLEDGE AND/OR UNDERSTANDING
    Understanding what makes up a desirable personality
    Understanding what makes for popularity
    Understanding the significance of being a friendly person
    Understanding the significance of dating
    Knowing and understanding psychological reasons for liking and disliking people
    Knowing how your family can help develop friendliness
    Understanding how families can hinder friendliness

  SKILLS
    Having ability to use the various traits of a good personality in different situations
    Having ability to participate in the school, home, and community, and to get the most out of this participation
    Having ability to meet and be with others gracefully
    Having ability to make or accept a date and follow through with it
    Having ability to use and develop skills for successful dating

  ATTITUDES, INTERESTS, AND APPRECIATIONS
    Showing concern for, and a sensitivity to, the development of a sound personality
    Desiring to be popular and being aware of what is required to become popular
    Making a determined effort to strive to be friendly
    Being sensitive to dating and its place in the over-all development of you as a teenager

  CRITICAL THINKING
    Being able to recognize underlying assumptions and to draw warranted conclusions about personality and popularity
    Being able to establish the relationship between dating and the ultimate selection of a mate
    Being able to recognize that friendliness is not something you are born with but can acquire


Chart E. Biology: Reproduction of the Flower


CONTENT: POLLINATION

  KNOWLEDGE AND/OR UNDERSTANDING
    Knowledge of the essential organs of the flower
    Knowledge of the auxiliary organs of the flower
    Knowledge of pollination
    Knowledge of the functions of the flower
    Knowledge of cross-pollination and how nature has aided in it
    Knowledge of the disadvantage of self-pollination

  SKILLS
    Ability to analyze the parts of the flower
    Ability to read biological content with understanding and satisfaction
    Ability to perform fundamental operations of dissecting flowers with reasonable accuracy
    Ability to define the problems of self-pollination and cross-pollination
    Ability to read and interpret diagrams, graphs, charts

  ATTITUDES, INTERESTS, AND APPRECIATIONS
    Attitudes of open-mindedness and willingness to consider new facts about flowers
    Attitudes of intellectual honesty and scientific integrity
    Appreciation of the contributions of scientists
    Interest in some phase of science as a recreational activity or hobby
    Interest in science as a vocation

  CRITICAL THINKING
    Ability to analyze the reading of scientific material on flower reproduction
    Ability to analyze the parts of the flower
    Ability to organize the material
    Ability to weigh evidence


CONTENT: FERTILIZATION

  KNOWLEDGE AND/OR UNDERSTANDING
    Knowledge of fertilization
    Knowledge of the reproductive structure of a flower
    Knowledge of what is accomplished by the pollen grain
    Knowledge of the process and terms in ovule formation

  SKILLS
    Ability to interpret facts about fertilization and reproduction
    Ability to perform simple manipulatory activities with science equipment in dissecting flowers
    Ability to read and interpret diagrams and charts on the process of ovule formation

  ATTITUDES, INTERESTS, AND APPRECIATIONS
    Sensitivity to possible uses and applications of science in personal relationships, and disposition to use scientific knowledge and ability in such relationships (attitude)
    Desire to broaden and develop scientific concepts of flower reproduction
    Interest in experimentation
    Interest in developing the ability to make careful observations

  CRITICAL THINKING
    Ability to draw charts, tables, and diagrams of the floral parts and processes of reproduction
    Ability to interpret or explain facts about reproduction in the flower


CONTENT: DEVELOPMENT OF THE SEED

  KNOWLEDGE AND/OR UNDERSTANDING
    Knowledge of what happens to the flower after fertilization
    Knowledge of development of the seed from the embryo
    Knowledge of the structure of the seed
    Knowledge of the parts of the seed which develop into the young plant

  SKILLS
    Ability to draw conclusions about fertilization in the flower
    Ability to define terms in the vocabulary of flowers
    Ability to organize and draw diagrams of reproduction of the flower
    Ability to experiment with growing seeds

  ATTITUDES, INTERESTS, AND APPRECIATIONS
    Interest in recording data and keeping records of flower reproduction
    Interest and faith on the part of each individual in his own ability to solve problems
    Attitudes of willingness to consider evidence objectively
    Attitudes of willingness to suspend judgments

  CRITICAL THINKING
    Ability to dissect the flower and the seed to discover new facts
    Ability to test conclusions
    Ability to apply facts to new situations


CONTENT: SEED DISSEMINATION

  KNOWLEDGE AND/OR UNDERSTANDING
    Knowledge of how seeds are disseminated
    Knowledge of why seed dissemination is important
    Knowledge of how nature has aided plants in seed dissemination

  SKILLS
    Ability to analyze nature about us to gain new facts
    Ability to study the situation for all facts and clues bearing upon the problem of seed dissemination
    Ability to make the best tentative explanation or hypotheses
    Ability to draw conclusions

  ATTITUDES, INTERESTS, AND APPRECIATIONS
    Attitudes of respect for another's point of view; open-mindedness and willingness to be convinced by evidence
    Attitudes of weighing evidence with respect to its pertinence, soundness, and adequacy
    Attitudes of cautiousness in announcing and accepting ideas about the environment

  CRITICAL THINKING
    Ability to draw conclusions, that is, to render judgment
    Ability to compare and develop one’s own opinion
    Ability to comprehend the subject material


CHAPTER 5


A Measurement Rationale 


TEACHERS ASK many questions about the use of tests, rating scales,
inventories, and other instruments of measurement. They want to 
know: How can I tell if it is a good test? Why do students taking 
the same test score differently at different times? How can this test 
be “fair” for my students when it is testing them over things that we 
haven’t even studied? Why is one test supposed to be a good test 
because it correlates highly with another test? When I use an interest 
inventory, how can I be sure that the students are really expressing 
their interests? Of what use is a standardized test if the results are 
not good indicators of what the students are actually doing in their 
classes? Are teacher-made tests more valid than standardized tests? 
Why should rural teachers use tests that were made for children 
living in the city? 

All of the questions listed above and the many others that teachers 
ask about tests and testing have been and are perplexing issues for 
test makers, test publishers, and test users. The concern of these 
various groups has led to the formulation of statements designed to 
establish standards of approved practice for the guidance of both 
users and producers of tests. The extent to which these standards are 
met in the future will be a measure of the growth of the measurement 
and evaluation movement. 

Despite the vast differences between format and construction of 
various evaluation devices, there are certain common standards or 
criteria that should be met by any measuring instrument. Whether one
uses a true-false test, an essay test, or a rating scale, each should
be chosen carefully in terms of the criteria of (1) validity, (2)
reliability, (3) objectivity, (4) efficiency, and (5) usefulness. It
should be understood that not all appraisal techniques are equally
objective, nor can we be sure that an indirect measure is as valid as
a direct measure of growth toward a particular objective. We are
justified in using any given technique in our work with students to
the extent that it satisfies these criteria. These criteria will be
briefly explained in the following sections.


VALIDITY 


The most important characteristic of any appraisal technique is 
validity—the extent to which the technique actually measures what 
it is supposed to measure. For purposes of illustration, suppose we 
have constructed an objective test to measure status with regard to 
the objective: understanding current social problems. To what ex- 
tent does our test actually measure such understanding? Is the test 
so complex that it measures “intelligence” more than competence in 
subject matter; or is it so highly weighted with items of basic vocab- 
ulary that it permits us to draw few inferences about higher level 
understandings and applications of knowledge? To what extent do 
the items in the test measure really important goals of instruction? 
These, among many others, are the questions we must answer before 
we can fully justify using a particular test or other evaluative device. 

The concept of validity is precisely described in manuals prepared 
by committees of the American Psychological Association, the 
American Educational Research Association, and the National Coun- 
cil on Measurements Used in Education. The following statement 
incorporates materials from these manuals. 


Validity information indicates to the test user the degree to which the 
test is capable of achieving certain aims. Tests are used for several types 
of judgment, and for each type of judgment a somewhat different type 
of validation is involved. We may distinguish four aims of testing: 

1. The test user wishes to determine how an individual would per- 
form at present in a given universe of situations of which the test 
situation constitutes a sample. 

2. The test user wishes to predict an individual’s future performance 
(on the test or on some external variable). 




3. The test user wishes to estimate an individual’s present status on 
some variable external to the test. 

4. The test user wishes to infer the degree to which the individual 
possesses some trait or quality (construct) presumed to be re- 
flected in the test performance. 


Thus, a vocabulary test might be used simply as a measure of present 
vocabulary, as a predictor of college success, as a means of discrimi- 
nating schizophrenics from organics, or as a means of making inferences 
about “intellectual capacity.” 

To determine how suitable a test is for each of these uses, it is neces- 
sary to gather the appropriate sort of validity information. Four types 
of validity have been distinguished, namely, content validity, concurrent 
validity, predictive validity, and construct validity. 

Content validity is concerned with the sampling of a specified uni- 
verse of content. 

Concurrent validity is concerned with the relation of test scores to an 
accepted contemporary criterion of performance on the variable which 
the test is intended to measure. 

Predictive validity is concerned with the relation of test scores to 
measures on a criterion based on performance at some later date. 

Construct validity. More indirect validating procedures, which we 
refer to under the name construct validation, are invoked when the pre- 
ceding three methods are insufficient to indicate the degree to which the 
test measures what it is intended to measure.¹ 

¹ “Technical Recommendations for Psychological Tests and Diagnostic Tech- 
niques,” supplement to the Psychological Bulletin, LI, 2, Part 2, March, 1954, p. 13. 
“Technical Recommendations for Achievement Tests,” report prepared by Com- 
mittees on Test Standards of the AERA and NCMUE (Washington: National Edu- 
cation Association, January, 1955), p. 16. 


Validity may be determined in a number of ways; some of these 
involve curricular approaches while others employ statistical 
analyses. 

There are three major techniques for the establishment of the 
content validity of measuring instruments, these being (1) direct 
comparison with objectives of instruction, (2) comparison with “ex- 
pert” opinion, and (3) comparison with textbook and source ma- 
terials. The latter two approaches are based on the idea that con- 
sensus by experts and textbook writers concerning the important 
objectives and content in a particular area adequately defines for 
teachers what these objectives should be. To a major degree this is 
a sound assumption, for these people have studied the field carefully 
and should have a good idea of what are valid objectives. Then, too, 
analyses of most of the major studies of objectives reveal a surpris- 
ing amount of agreement as to the objectives of instruction for a 
grade or area even though the content used by teachers in different 
schools may be different. Even the content used in different schools 
is more apt to be alike than different, although wide variations in 
this regard do exist. The important thing to be noted, however, in 
establishing validity for a test or device by these “consensus” meth- 
ods, is that mere conformity to what the experts agree to may not 
be an adequate test of validity for an individual, school, or grade. To 
the extent that the school or grade in question has defined its ob- 
jectives in a different fashion or in terms of different goals than the 
consensus, the establishment of content validity by comparing the 
instrument in question with expert opinion or textbook analyses has 
limited value. By far the most basic test of the content validity of 
a measuring instrument involves a direct comparison of the actual 
behaviors involved in the stated objectives of instruction with the 
behaviors needed for success on the test. 

Content validity of a test or other evaluative device may be se- 
cured if a grid analysis procedure is employed in the construction of 
the test. The grid approach helps assure that the items in the test 
are samples of subject-matter content and that the behavioral ob- 
jectives of the course or lesson are represented in the test in propor- 
tion to their importance. The content validity of testing devices us- 
ing a less structured approach can also be secured if definite attempts 
have been made during the construction process to include only those 
items that relate directly to the objectives of the school. 

For tests obtained from commercial sources or not made by the 
teacher who will actually use them, content validity can also be 
determined by a comparison with teaching objectives. Some schools, 
for example, have found a certain standardized achievement test 
battery to lack validity for their purposes because the content of the 
tests is limited almost exclusively to items measuring information 
and knowledge. These schools, which, of course, do have objectives 
stressing acquisition of knowledge, have aimed their major instruc- 
tional efforts at the level of understanding or application of knowl- 
edge. The tests, therefore, lack content validity for these schools. 
In much the same way, a school that enrolls students of superior 
academic ability may find that many tests intended for use with 
average-ability youngsters may lack validity for them because the 
tests do not probe the students’ competence at a high enough level. 
As a measure of general knowledge and application, such a test 
might be valid for 90 per cent of the high schools, but for the remain- 
ing 10 per cent, because of differing objectives, it would lack suf- 
ficient validity to warrant its use. 

Everything that has been said about establishing the content 
validity of tests, both teacher-made and standardized, holds true for 
all other kinds of evaluative devices. Rating scales and checklists 
must be based upon the behaviors defined in the instructional ob- 
jectives or their use can be seriously questioned. A rating scale of 
character traits, for example, must provide measures of those traits 
which the school is attempting to develop or the scale lacks validity 
for the school. In the area of personality measurement through the 
use of paper-and-pencil devices, teachers must be very careful to note 
whether the test maker’s definition of personality agrees with that 
of the school, for there are about as many definitions of personality 
as there are tests. One must go far beyond the printed title of a test 
into a careful analysis of the actual content, level of competence de- 
manded, definition, and so on before one can be assured that it is 
valid for use in any given teaching situation. In the area of direct 
observation, teachers must ensure that they observe those behaviors 
defined as most important in their objectives, and not confine them- 
selves to observing only the most easily observed behaviors. Sum- 
maries of anecdotal records, for example, which reveal that practically 
all of the observations from a particular teacher stress competence 
in only one or two areas, suggest a low degree of validity as measures 
of the student’s over-all competence and adjustment. In much the 
same way, a teacher whose observations stress only negative behaviors 
may be said to be using a less valid approach than the teacher who 
looks for and records observations of both positive and negative 
behaviors. 

The concurrent or predictive validity of a particular measurement 
device can likewise be established in several ways, and generally in- 
volves the calculation of coefficients of correlation (described in some 
detail in Chapter 14) between the test and some independent criteria 
of success. A newly devised test, for example, may be compared 
statistically with an established test of accepted validity to determine 
how well the new test measures those qualities measured by the old. 
This is a common technique in the development of new forms of a 
test, in which the general objectives remain constant but where new 
content or new items are used in succeeding forms. The test of valid- 
ity here is one of determining to what degree two forms of the instru- 
ment are measuring common objectives. Many of the existing paper- 
and-pencil intelligence tests have been validated by computing 
correlation coefficients between scores on the new test and scores ob- 
tained from the Stanford-Binet scale. The higher the correlation 
value, the more sure we are that this new test is valid for measuring 
those qualities of intelligence measured by the Stanford-Binet scale. 
If, however, the new test is based on a different conception of in- 
telligence and is presumed to measure different qualities of aca- 
demic aptitude, then we would expect low correlations (indications 
of little validity) with scores from the Stanford-Binet, our independ- 
ent criterion. 
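
As a rough illustration of the computation just described, the following Python sketch correlates scores on a new test with scores on an established criterion test. The function name `pearson_r` and both score lists are invented for illustration; they are assumptions and not data from any actual test.

```python
import math

def pearson_r(x, y):
    """Return the Pearson product-moment correlation between two score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists of at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations from the two means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each set of scores
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical scores for the same six students (illustrative only)
new_test = [98, 105, 110, 92, 120, 101]
criterion = [100, 108, 112, 95, 118, 99]

validity_coefficient = pearson_r(new_test, criterion)
print(round(validity_coefficient, 2))
```

A coefficient near +1.00 would suggest that the new test ranks students much as the criterion does; whether that counts as evidence of validity still depends on the criterion itself being acceptable.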

Very often educational measurements are validated by computing 
the relationships between these measures and some other measure of 
success in school, such as average grades or an “honor point ratio.” 
Assuming that these latter criteria are valid in their own right as 
good indicators of the individual’s attainment of the objectives of 
the school, a questionable assumption in some cases, then a measure 
of competence that produces scores in substantial agreement with 
these criteria is said to be valid as a measure of attainment of the 
school objectives. A test devised as an over-all final measure of a 
person’s competence with regard to the objectives in a high school 
physics course can be said to be a valid measure if scores on the test 
substantially agree with a rank ordering of students made from a 
variety of formal and informal evaluations made during the school 
year. In another sense, a typing test could be said to be valid if all 
those students who had scores above the average were successful 
in post-high school typing positions, while all of those students who 
had below-average scores turned out to be dubious or actually poor 
typists in later on-the-job situations. For students in a similar class 
in the future, then, the teacher could make reasonably valid predic- 
tions of their later typing success purely on the basis of this one 
test administered during high school. 
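
The typing illustration can be sketched as a simple tally of agreement between test standing and later success. This is only one informal way of looking at predictive validity, not a substitute for a correlation coefficient; the function name `prediction_agreement` and all of the figures are hypothetical.

```python
from statistics import mean

def prediction_agreement(test_scores, later_success):
    """Fraction of students whose above- or below-average test standing
    matches their later success or failure on the job."""
    average = mean(test_scores)
    hits = sum(
        (score > average) == succeeded
        for score, succeeded in zip(test_scores, later_success)
    )
    return hits / len(test_scores)

# Hypothetical typing-test scores and later on-the-job outcomes
typing_scores = [62, 48, 55, 71, 39, 66]
succeeded_on_job = [True, False, True, True, False, True]

agreement = prediction_agreement(typing_scores, succeeded_on_job)
print(round(agreement, 2))
```

In the ideal case described in the text, every above-average student succeeds and every below-average student does not, and the fraction would be 1.0.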

Concurrent and predictive validity of many of the informal de- 
vices to be described in later chapters of this book can be determined 
statistically through comparisons of scores or qualitative descrip- 
tions of behaviors with peer ratings or with some other type of rating 
by associates. Such ratings, lacking the objectivity and proven relia- 
bility of some of the more traditional testing procedures, neverthe- 
less are often stable and trustworthy enough to allow their use in 
establishing statistical validity. If later careful analysis through a 
case study shows, for example, that a student has been subjected to 
a host of personal adjustment problems that tended to interfere 
with efficient learning, such evidence would tend to validate this 
student’s high intelligence test score which had been made suspect 
by rather low achievement. In like manner, a student singled out 
for special study because of a suspicious “social adjustment” score 
on a personality inventory might turn out to be the person who 
tends to be rejected by other members of his group when various 
sociometric devices are used. In this case, the evidence points to 
the basic validity of the “social adjustment” score in identifying 
social isolates among the student body. 

The importance of analyzing a measuring instrument for con- 
struct validity is presented in the manual, “Technical Recommen- 
dations for Achievement Tests.” 


Construct validity is highly important in achievement testing. In the 
first place, universes of content, while relatively easy to specify in the 
traditional school subjects, nevertheless are very difficult to specify in 
many important areas such as study skills, attitudes, interests, under- 
standings, and appreciations. Second, altho in some cases criteria for 
concurrent validity can be obtained, as in the example above, this is not 
generally the case. In fact, for the purposes of most educational achieve- 
ment tests, no suitable concurrent criterion of performance has usually 
been available to determine validity in terms of a correlation coefficient 
or other quantitative objective measure. Such criterion data as teachers’ 
ratings, course grades, or teacher-constructed examination grades are 
generally inadequate for this purpose and frequently inferior to the tests 
for which they are to be used. The same considerations apply to predic- 
tive validity in many situations. 




The use of construct validity in achievement testing may also be clari- 
fied by an illustration with a test of study habit skills, that is, a test not 
in a traditional subject. What evidence might be presented to support 
the claim that such a test is valid? For a given group of students predic- 
tive evidence of expected achievement status at the end of the year in 
a given course is available. The predictions are based on both a prog- 
nosis test in the subject and a general scholastic aptitude test. The corre- 
lation between scores on these two tests and final achievement status for 
similar groups in the past has been above .70. From among the several 
hundred students in the group, 30 are chosen who after the middle of 
the year are far above their predicted achievement status. Another 30 
are chosen who at that time are far below their predicted achievement 
status. 

It is hypothesized that differences in study habit skills are in large 
measure responsible for the fact that the first group of students are doing 
much better than is expected of them and that the second group are 
doing much less well than is expected of them. If this hypothesis is 
sound, the items on the study skills test and the test as a whole should 
discriminate significantly between the two groups. The selection of the 
items to remain in the test would then be based on the degree to which 
they discriminate between the two groups provided other variables, such 
as home background and academic preparation, are controlled. The 
validity of the test in its final form would be determined by its ability 
to discriminate between two other groups similarly chosen. 

The evidence for validity presented in this instance is clearly evidence 
of construct validity. It is by the very nature of the case inferential 
rather than conclusive. The acceptance of this evidence by a prospective 
user of the test would depend on his acceptance of the hypothesis stated 
above, on the method employed to secure and analyze the data, and on 
the degree to which the items discriminated between the groups who are 
presumed to differ in study habits.² 

² “Technical Recommendations for Achievement Tests,” report prepared by Com- 
mittees on Test Standards of the AERA and NCMUE (Washington: National 
Education Association, January, 1955), pp. 17-19. 


RELIABILITY 


The reliability of a measuring instrument refers to the consistency 
with which it measures, or the extent to which it can be trusted to 
give us the same or similar scores or descriptions of behaviors 
at different times. Let us take two extreme cases. First, let us as- 
sume that we have a history test, which was administered to a 
tenth-grade class on two successive Tuesday afternoons. Every 
student received the same score on the second administration of 
the test as on the first. This test, we could then say, is 100 per cent 
reliable—we can have complete confidence that the score we would 
get on any single testing would be a true or reliable measure of 
any particular student’s achievement, for we have already seen 
that the scores didn’t change on a retest a week later. 

The opposite extreme would occur if, on another history test, none 
of the students attained the same score on the second testing that 
they got on the first. As a matter of fact there appeared to be little 
relationship at all between any individual's two scores, for those who 
had high scores on the first testing had scores over the entire range 
of scores on the second. The same was true for those who received 
very low scores on the first testing, their second scores extending 
over the complete range of scores, from the highest to the lowest. 
We would then say this test had very low reliability, for we could 
have no confidence at all that the score an individual student at- 
tained on a single testing would be repeated if the test were taken 
again. Since it is only in very exceptional cases that a test can or 
should be administered twice to students, we need the most reliable 
measures that we can get, so that we can be as confident as possible 
that the results of a single testing will represent a true measure of 
the student's achievement. 

Reliability is usually represented in terms of correlation co- 
efficients. A test that is perfectly reliable would thus be said to have 
a reliability coefficient of +1.00, while the opposite extreme de- 
scribed above would have close to a 0.00 coefficient. 

Test reliability is established commonly in one of three ways, 
either by the test-retest, the comparable forms, or the split-halves 
method. Each method has its advantages and disadvantages. 

The test-retest method involves two separate administrations of 
the same test to a single group of students. Testing conditions are 
made as standard as possible so that the only factor theoretically 
contributing to variation of attainment between administrations is 
that of internal instability of the instrument itself. A suitable time 
interval is chosen between testing, usually a week or more, to mini- 
mize the possibility of students remembering individual choices 
made on the first testing for items about which a definite or confi- 
dent response could not be made. At the same time, care must be 
taken to ensure that the same content or factors are being measured 
at the two testings and that actual learning has not occurred be- 
tween them. An extreme example of this problem would be one in 
which a semester was allowed to pass between testings to minimize 
individual item recall in a test in American history, but during which 
semester students studied history. It is obvious that scores on the 
two administrations would differ according to the learning rates for 
individual students, the resultant correlations showing little about 
the consistency of the test as a measure. Actually, while this is an 
extreme case, it illustrates a weakness of the test-retest method, for 
we can never rule out the differential learning that takes place be- 
tween the test and retest, even if a very short time interval is al- 
lowed to elapse. Some students will have their interest aroused about 
particular items for which the true answer is not known during the 
first administration and will try to satisfy their intellectual curiosity 
before a retest, whether they realize a retest is going to be given or 
not. Others will make no effort to find the answers to questions they 
could not readily answer on the first administration of the test. Thus, 
while we might suspect that students who do not study after the 
first testing will make about the same score on the retest as on the 
first testing, it is also reasonable to suspect that those students who 
do some studying will better their scores on the second administra- 
tion to the extent to which real learning has taken place. Thus, this 
learning factor will operate to reduce the possibility of identical or 
closely related scores during the test-retest and will contribute to 
an underestimate of the true consistency with which the test meas- 
ures—its reliability. 
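
The effect of learning between administrations can be seen in a small sketch. The function name `retest_reliability` and the score lists are hypothetical, and the computation uses the standard-score form of the Pearson correlation, which is an assumption about method rather than anything prescribed by the text.

```python
from statistics import mean, pstdev

def retest_reliability(first, second):
    """Correlate scores from two administrations of the same test."""
    m1, m2 = mean(first), mean(second)
    s1, s2 = pstdev(first), pstdev(second)
    # The average product of paired standard (z) scores is the Pearson r.
    products = [((a - m1) / s1) * ((b - m2) / s2) for a, b in zip(first, second)]
    return mean(products)

# Hypothetical scores for six students on the first testing
first_testing = [40, 55, 62, 70, 48, 81]

# Retest in which no student's score changes
no_change = list(first_testing)

# Retest in which the second and fifth students studied in the interval
some_studied = [40, 63, 62, 71, 60, 81]

r_no_change = retest_reliability(first_testing, no_change)
r_with_learning = retest_reliability(first_testing, some_studied)
print(round(r_no_change, 2), round(r_with_learning, 2))
```

The second coefficient falls below the first even though the test itself is unchanged, which is the underestimate the paragraph above describes.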

The comparable forms method for establishing reliability is limited 
to those cases in which two or more forms of the same test are de- 
veloped. This is a common approach in the development of stand- 
ardized tests where several forms of the test are considered desir- 
able. The method assumes that the content on the two forms of the 
test is comparable in a curricular sense and that equal weight is 
given the common objectives measured even though the actual items 
in the two forms will be different. This is at once the advantage and 
disadvantage of this method, for, while we have ruled out the learning 
factor by changing the actual item content in the two forms of the 
test, we have introduced some doubts as to the actual comparability 
of the items chosen for each form. Even when the two forms of the 
test are given to the same students under absolutely constant con- 
ditions, insofar as this is possible when one deals with human 
beings, the very fact that the two forms of the test are measuring 
common objectives with items closely related but not identical in 
the two forms suggests that there is some learning that might take 
place in addition to the variation injected into the situation by the 
use of nonidentical content. Again, our correlation coefficients would 
tend to underestimate the true reliability of the test. 

The split-halves method is based on the single administration of 
but one form of a test. Thus, the effects of different testing condi- 
tions, differences in students’ actual physical conditions and health 
at the different testing times, and major differences in item content 
are ruled out. The actual analysis or determination of reliability is 
made on halves of the test, the total test being divided into two 
parts, generally by getting one score for all the even-numbered items 
in the test and a second score for all the odd-numbered items. (A 
variation would be to arrive at a score for the first half of a test 
and a score on the second half, but this involves a fatigue factor and 
a greater possibility of differences in item difficulty than would be 
found in an odd-even split of the test.) The two scores derived from 
the test are then correlated. These scores, it should be noted, actu- 
ally represent two tests, each one being just half as long as the orig- 
inal or total test. Since it is generally true that increasing the length 
of a test increases its reliability, it is necessary to correct the co- 
efficient calculated from the two small tests and predict what the 
correlation would be had the two scores been based on the same 
number of items as were found in the original test. This is done by 
the use of the Spearman-Brown prophecy formula. (This formula 
and the statistical procedure to be followed in computing this co- 
efficient may be found in any standard elementary statistics book.) 
It should be pointed out that the split-halves technique is applicable 
only for “power” tests, in which all students have a chance to finish 
all the test items. 
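
The split-halves procedure can be sketched in a few lines of Python. The Spearman-Brown correction for a test of doubled length is r_full = 2 r_half / (1 + r_half). The function names and the right-wrong response table here are hypothetical and serve only to illustrate the steps.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_brown(r_half):
    """Estimate full-length reliability from the half-test correlation."""
    return 2 * r_half / (1 + r_half)

def split_half_reliability(responses):
    """Return (half-test correlation, corrected reliability) for an odd-even split.

    Each row holds one student's item scores (1 = right, 0 = wrong).
    """
    odd_scores = [sum(row[0::2]) for row in responses]   # items 1, 3, 5, ...
    even_scores = [sum(row[1::2]) for row in responses]  # items 2, 4, 6, ...
    r_half = pearson_r(odd_scores, even_scores)
    return r_half, spearman_brown(r_half)

# Hypothetical responses of six students to an eight-item test
responses = [
    [1, 1, 1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 0, 1, 0],
    [1, 1, 0, 0, 1, 0, 0, 0],
    [0, 1, 0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
]

r_half, r_full = split_half_reliability(responses)
print(round(r_half, 2), round(r_full, 2))
```

For any positive half-test correlation below 1.00 the corrected value is larger, reflecting the greater length of the whole test.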

Teachers often ask, “How large should the correlation be to indi- 
cate a really reliable test or instrument?” While we might answer, 
“As large as possible,” a more objective answer would be that we 
should strive for reliabilities of at least .90 for standardized and 
teacher-made objective tests. However, two main points might be 
added to qualify this answer. To a degree, standards of reliability 
depend upon the use to which the test is to be put, with an increase 
in reliability called for as the emphasis placed on test results in- 
creases. For example, since many teachers tend to interpret the I.Q. 
in a more absolute fashion than may be justified, we need extremely 
high consistency in this type of measurement, and values of .90 
would be an absolute minimum. In like fashion, test results that are 
to be the basis for a grade or an award of some kind should be 
ultrareliable. On the other hand, if a diagnostic test is given with 
the idea that responses to individual items will be the basis for re- 
medial work or as clues for further intensive exploration, with little 
emphasis placed on an over-all score for the test, then high standards 
of over-all reliability are not needed. In like fashion, if items on a 
personality inventory are to be used as openings for student inter- 
views, it is not essential that the over-all test have a reliability of 
.90 or higher, unless over-all scores are also to be used as measures 
of student adjustment in the more usual test sense. The second 
main qualification one must add in this matter of test reliability re- 
fers to the maximum reliability one can reasonably expect for some 
of the more informal techniques a teacher may use. For some of 
these techniques, essay tests, for example, it is quite unrealistic 
to say that, unless they have a reliability coefficient of .90 or higher, 
we should either not use them or have little confi- 
dence in the results we get from their use. Essay examinations just 
don’t lend themselves to such rigid standards of reliability. We 
might find that a rating scale we wish to use is of such a nature that 
even a well-trained observer using the scale in two administrations 
can agree with himself with respect to only 80 per cent of the items 
rated. Does this mean that we should not use the rating scale at all, 
or that we should not use essay tests because reliability coefficients 
for essays are generally far below our .90 standard? 

Our answer would be “no,” if these tests are the best and most 
valid techniques one can use to measure the behavior or knowledge 
concerned. If other techniques of equal validity and of greater re- 
liability exist, then we should use these other techniques. However, 
if we would have to sacrifice validity for reliability, then we should 
use the more valid but less reliable instrument. A general rule that 
might be proposed at this point would be that we should strive to 
use the most reliable and valid instruments at our disposal. If a test 
has a reliability coefficient of less than .90 but is still the best meas- 
ure we have at hand, we should use it and strive at the same time to 
improve it so that it can be made more reliable. We should, however, 
expect less precise results as our reliability decreases and not try 
to interpret our results as being as trustworthy as they would be 
with a more reliable instrument. 


OBJECTIVITY 


Closely related to the concept of test reliability, and indeed con- 
tributing to it, is the concept of objectivity. This concept has refer- 
ence to clearness of meaning and procedure, the establishment of 
a set of conditions leading to standard procedure and interpretation, 
and the elimination of all elements of personal bias. For example, 
extreme care must be taken in the wording of test questions so that 
all students will understand the question in the same way. An 
essay question, for example, that asks students to “discuss” a par- 
ticular issue without defining exactly what the student should in- 
clude in his discussion might be said to lack objectivity. In like 
fashion, a rating scale that calls for ratings of “cooperation” or 
“leadership” without specifying with very definite behavioral descrip- 
tions what these terms mean can produce almost as many different 
ratings as there are raters. 

A fairly common assumption made by most teachers is that test 
questions, questionnaire items, and the like, have a common mean- 
ing for all their students. However, it takes but a little probing of 
student responses to particular test questions, including those of the 
objective type, to disclose that often what we think is a clear state- 
ment of intent or meaning can be variously interpreted by our 
students. In another case, most of us have probably received at one 
time or another a questionnaire or some type of inventory of opinion 
or interest in which there are one or more items which we struggle 
to answer because we are just not sure what the author meant. 

A test, rating scale, questionnaire form, or any other kind of 
evaluative instrument must be very carefully constructed and, if 
possible, pretested, to ensure that the intent of each item is clear 
and that all responses to a particular item will lend themselves to a 
single standard interpretation. In the case of standardized tests 
this process of experimental tryout, analysis, and revision is one of 
the best guarantees that the test so developed will be useful in pro- 
viding a standard measure of achievement or ability. 

Fully as important as provision for objectivity in test construc- 
tion is the need for a standard procedure for the scoring of such 
examinations. Objectivity with respect to grading is a particular 
problem with free-response or essay tests. Unlike most objective- 
type test items which are so designed that personal bias is elimi- 
nated and a single correct answer is established for all students, 
essays, unless carefully used, easily lend themselves to the exercise 
of bias and prejudice. This need not be the case, as will be explained 
in detail in Chapter 8, for there are ways in which essays can be 
graded objectively if the teacher will but take the time and effort 
needed to use this technique as it should be used. 

Grading procedures that are poorly thought out, that permit any 
kind of positive or negative bias, that are influenced by irrelevant 
details, and that try to compress into a single grade evaluation of 
several quite different abilities or skills have no place in a program 
of functional evaluation. 

Objectivity in terms of test or inventory construction is a “must,” 
so that our instruments have a standard meaning for all students. Of 
equal importance is the need for objectivity in grading and using the 
results of measurement. 


EFFICIENCY 


Efficiency, as used in this book, refers to economy of time, effort, 
and money in the process of making meaningful and useful evaluations
of student status and progress. This is not to imply that the
method that takes the least time, effort, and money is the best, for 
in measurement, as with so many other areas of life, we receive only 
what we pay for and have to judge our expenditures in terms of the 
actual values received. Each of these means of achieving test 
efficiency will be treated separately, but the reader should under- 
stand that real efficiency is obtained in a total situation through a 
wise combination of these and other factors. 

1. Time. A very common complaint of teachers is that they are
expected to spend so much time evaluating student behavior that 
they haven’t any time left to teach. While the writers would hold 
that little good teaching can be accomplished unless the teacher 
knows through comprehensive evaluation the particular strengths 
and weaknesses of students, of his own teaching, and of the total 
school and community situation as they relate to effective learning, 
there is certainly a need for an efficient use of the teacher’s time in 
connection with evaluation activities. 

To illustrate this point of efficient use of time, let us suppose that 
a school is in the process of selecting a new test of mental ability 
for use in the high school. The choice has narrowed down to test 
X and test Z. Both provide but a single mental-age score which can 
be converted into an I.Q., both are machine-scorable, both cost
the same amount of money, both have reliability coefficients in the 
high .90's, and the validity of the tests is about equal when corre- 
lated against school grades. Test X requires ninety minutes of actual 
testing time, plus another fifteen to twenty minutes for giving direc- 
tions, and so on. Test Z can be given in but forty-five minutes, in- 
cluding the time necessary for administrative detail. Other things 
being equal, efficiency of time would indicate the choice of test Z. 
As other factors in the testing situation differ, however, a different 
decision may be called for at the expense of economy of time. Con- 
sider, for example, the above case, but let us suppose that test X 
has a reliability coefficient of .94 as opposed to one of .87 for test Z; 
test X's validity is described by a coefficient of .63 as opposed to one 
of .45 for test Z. Other factors are equal. In this case, pure economy 
of time would suggest the choice of test Z, but this would probably 
be a poor choice when all factors are considered. The point must be
stressed that we must not sacrifice validity and reliability for
reasons of economy, for the over-all value and use of an evaluation 
technique in guiding student learning must be the primary considera- 
tion in the choice of such techniques. 
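
The comparison of tests X and Z can be sketched as a small decision rule. The sketch below is illustrative only: the `TestOption` record, the `choose_test` function, the tolerance of .02 for treating two coefficients as "about equal," and the coefficients used in the first case are assumptions introduced here. Only the figures in the second case (.94 and .87, .63 and .45) and the testing times come from the example above.

```python
from dataclasses import dataclass

@dataclass
class TestOption:
    name: str
    minutes: int          # total administration time, including directions
    reliability: float    # reliability coefficient
    validity: float       # validity coefficient against school grades

def choose_test(a, b, tolerance=0.02):
    """Prefer the more valid and reliable test; use time only as a tie-breaker."""
    for quality in ("validity", "reliability"):
        diff = getattr(a, quality) - getattr(b, quality)
        if abs(diff) > tolerance:
            return a if diff > 0 else b
    return a if a.minutes <= b.minutes else b

# First case: quality about equal (assumed values), so the shorter test wins.
x1 = TestOption("X", 110, 0.95, 0.60)
z1 = TestOption("Z", 45, 0.95, 0.60)

# Second case: coefficients from the text; X is chosen despite its length.
x2 = TestOption("X", 110, 0.94, 0.63)
z2 = TestOption("Z", 45, 0.87, 0.45)

first_choice = choose_test(x1, z1).name
second_choice = choose_test(x2, z2).name
```

Time enters the rule only after validity and reliability have been compared, which mirrors the order of priority argued for in the paragraph.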

A second illustration of the concept of efficiency of time refers 
to the need for a quick and faithful summarization of anecdotal 
records, written descriptions of actual behaviors in a variety of 
situations. Many of the school systems that have adopted this sys- 
tem of reporting student behaviors have dropped it because they 
found that summarization of the reports took too much time. That
this need not occur is noted later in Chapter 10 where anecdotal 
records are discussed, for there are efficient means of summarizing 
such data that require but little of a teacher’s time. In considering 
the several methods of summarization, time stands out as a major 
consideration, so that the method one might recommend could be 
the one that provides a meaningful summary with the least
expenditure of time.

One might add parenthetically at this point that total testing time 
should be very carefully evaluated when one is in the position of
choosing alternative tests or methods. For example, some standardized
test manuals give a figure that represents the amount of time
needed by students to complete a certain test. Thus, it may be
that both test X and test Y are described as thirty-five-minute tests. 
A teacher planning to use one or the other test in a forty-minute 
class period may conclude that either one could be administered in 
this time. Experience, or reading of the manuals, may show how- 
ever, that test X needs only a preliminary introduction and that 
after the mechanics of distributing the test booklets, answer sheets, 
and pencils to students is accomplished, there is but a need for 
thirty-five minutes of testing time. Test Y on the other hand, while 
requiring only thirty-five minutes of actual testing time, may be 
in several parts, each of which requires the reading of a new set of 
directions and the completion of a practice exercise. Actual adminis- 
tration of the test may call for upwards of fifty to fifty-five minutes. 
Other factors being equal again, an efficient use of teacher and 
student time would dictate the use of test X. 
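
The difference between stated testing time and actual administration time can be worked out before a test is chosen. The following is a rough sketch: the function name and the setup and per-part figures are hypothetical. Only the thirty-five-minute testing time, the forty-minute period, and the fifty to fifty-five minute total for test Y come from the example above.

```python
def administration_minutes(testing, setup, parts=1, per_part_directions=0):
    """Total class time: setup, timed work, and directions for each part."""
    return setup + testing + parts * per_part_directions

CLASS_PERIOD = 40

# Test X: one brief introduction, then 35 minutes of timed work.
test_x = administration_minutes(testing=35, setup=4)

# Test Y: same timed work, but four parts, each with its own directions
# and practice exercise (assumed here to take 3 minutes apiece).
test_y = administration_minutes(testing=35, setup=4, parts=4, per_part_directions=3)

fits_x = test_x <= CLASS_PERIOD
fits_y = test_y <= CLASS_PERIOD
```

Under these assumed figures test X needs 39 minutes and test Y needs 51, so only test X fits a single class period.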

2. Effort. Closely allied to the concept of efficiency of time is that 
of effort as it relates to both students and teacher. Perhaps the 
major factor in reference to this concept relates to administrative 
detail in the evaluation program—scoring, summarization, reporting, 
and so on. This can perhaps be best illustrated with reference to 
the Kuder Preference Record—Vocational. While one of the con- 
siderations relative to the use of this interest inventory in the 
secondary school would certainly be one of money, a major con- 
sideration in the choice between the hand-scored and the machine- 
scored forms of the inventory would involve economy of effort. While 
it cannot be denied that a large number of tests can be scored with 
the least amount of actual effort on an IBM machine, assuming that
a machine was available and that one person would be responsible 
for the complete scoring of either the hand- or the machine-scored 
forms, one should consider the possibility of students scoring their 
own inventories using the hand-scored forms. In addition to this 
being a powerful motivational force with students, the pure saving 
of scoring effort resulting from students counting up their own 
scores and constructing their own profiles should be carefully con- 
sidered when making a choice of answer sheets. Efficiency of effort 
would also apply in the previous illustration with regard to economy 
of time in summarizing anecdotal records. Not only time but a great 
deal of physical energy can be saved through efficient summarization 
procedures.

A third illustration of economy of effort, especially as it refers to 
grading procedures, concerns the use of separate answer sheets for 
objective tests. If, for example, a multiple-choice test of sixty items 
is given a class and the students are instructed to record their 
answers directly on the test booklet of six to seven pages, the scoring 
procedure required for the test in this form would involve much
more effort than would one utilizing a separate answer sheet. The 
sheer mechanics of scoring eight or nine items on one page, turning 
over that page, scoring another nine or ten items, turning that page, 
and so on, involves considerable effort. The effort involved in run- 
ning through the scoring of sixty items, all simply and uniformly 
recorded on a separate answer sheet that requires no page turning, 
would most certainly be less. Thus, the choice between recording 
answers directly on the test booklet or on separate answer sheets 
could be made purely on the basis of economy of effort. 

3. Cost. One of the most important aspects of evaluation pro- 
cedure that needs continuous study relates to actual financial cost. 
Especially is this true in these days of competition for the taxpayer's 
dollar, need for major capital expenditures for buildings and basic 
equipment, and need for an increase in the number of teachers and 
in the level of salary paid teachers. Economy in dollars and cents 
is a must in the educational program. 

There are many places in the total evaluation program where 
decisions affecting dollar expenditures must be made, these expendi- 
tures being either direct or indirect. Indirect expense, for example, 
is tied up in the efficient use of time and effort, for, as the teacher
uses his time and effort efficiently, he is also using the taxpayer’s 
dollar efficiently. In making a choice between hand-scored or ma- 
chine-scored editions of standardized tests, one of the prime con- 
siderations should be cost. While there are good reasons why hand- 
scored editions of some standardized tests might be chosen (the 
actual test with the students’ responses entered on the booklet can 
be kept for diagnostic and referral use in the student’s cumulative 
record file, for example), one must consider that most of these edi- 
tions can be used but once. Machine-scored forms that permit the 
use of separate answer sheets, on the other hand, can be used over 
and over again, the only recurring expense being for the separate 
answer sheets. 
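
The cost comparison between consumable and reusable editions can be laid out as simple arithmetic. The sketch below uses hypothetical prices (in cents) and hypothetical function names; the source gives no actual figures, so the numbers serve only to show the form of the calculation.

```python
def consumable_cost(students, administrations, booklet_price):
    """Hand-scored edition: every student needs a new booklet each time."""
    return students * administrations * booklet_price

def reusable_cost(students, administrations, booklet_price, sheet_price):
    """Machine-scored edition: booklets bought once, answer sheets each time."""
    return students * booklet_price + students * administrations * sheet_price

# Hypothetical prices in cents for 100 students tested three times.
hand_scored = consumable_cost(100, 3, booklet_price=20)
machine_scored = reusable_cost(100, 3, booklet_price=25, sheet_price=3)
saving = hand_scored - machine_scored
```

With these assumed prices the reusable edition costs 3,400 cents against 6,000 cents for the consumable one. The diagnostic value of keeping the marked booklets, noted above, is not captured by such a calculation and has to be weighed separately.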

The factor of cost enters into the use of teacher-made examina- 
tions also, for decisions involving this factor must be made in such 
cases as whether a short quiz should be written on the classroom 
blackboard or if it should be machine duplicated. The number of 
examinations to be given during a semester and their length may 
have to be decided in part upon financial bases, although this 
should be strictly subordinate to the educational values to be gained. 


USEFULNESS 


The last major characteristic of a good test to be discussed here, 
and the reader should be aware of the fact that there are other 
criteria, is usefulness. In a sense, this is perhaps the most crucial of 
all the criteria, for unless our evaluation techniques and procedures 
produce results that are useful in helping boys and girls to learn 
efficiently, we may be hard pressed to justify the use of such tech- 
niques. 

Usefulness can be illustrated in many ways. One illustration from 
the area of standardized achievement testing may be pertinent. 
Many test experts and statisticians argue that standard-score norms 
are the most meaningful and useful methods of reporting student 
achievement, their superiority over percentile systems being great. 
While, in a technical sense, this is probably quite true, the fact 
remains that few teachers fully understand what standard scores 
are or the full implications for their use. While the writers would
admit that perhaps most teachers do not understand the use of
percentile norms as fully as would an expert, the fact remains that 
percentiles are far easier to interpret to teachers, parents, and stu- 
dents than are standard scores. More teachers are going to bring real 
meaning to a score that is reported to be equivalent to the 98th per- 
centile than they will bring to a standard score of plus 2 standard 
deviations, or a T-score of 70. Thus, for the average teacher, parent,
and student, a test that reports norms in terms of percentiles is 
perhaps more usable than one that reports only standard scores. 
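
The three ways of reporting the same score mentioned above are related by simple conversions. The sketch below assumes that scores are normally distributed and uses the conventional T-score scale (mean 50, standard deviation 10); the function names are introduced here for illustration.

```python
import math

def t_score(z):
    """T-score scale: mean 50, standard deviation 10."""
    return 50 + 10 * z

def percentile_rank(z):
    """Per cent of a normal distribution falling below standard score z."""
    return 100 * 0.5 * (1 + math.erf(z / math.sqrt(2)))

z = 2.0                      # plus 2 standard deviations
t = t_score(z)               # 70.0
pr = percentile_rank(z)      # about 97.7, roughly the 98th percentile
```

A standard score of plus 2, a T-score of 70, and a standing at about the 98th percentile therefore describe the same position in the group; the percentile is simply the form most readers can interpret without further explanation.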

In another area, the authors have long held that unless results 
obtained from rating scales can be utilized as aids to learning, their
use in schools is open to question. Some rating scales are so general 
as to defy real use in an educational sense, while others are so com- 
plex in mechanics of administration, scoring, and reporting as to deny 
any efficient use of the data. Some rating scales developed by stu- 
dents in measurement classes that contain from ten to fifteen dif- 
ferent descriptions of a single trait, such as leadership or coopera- 
tion, tend to be so unwieldy and clumsy as to be practically useless 
for the teacher.

In some schools, standardized achievement tests are given on a 
schoolwide basis, often under the supervision of the guidance
counselor or the principal. The tests are then sent to some scoring 
service for scoring and analysis. Upon their return to the school, the 
results all too often are filed, and the teacher's contact with the test- 
ing program has been limited to the use of his time, students, and 
classroom without any adequate provision for the use of the test 
results in his teaching program. In such a case, there is doubt as to 
the usefulness of the whole testing program. If, on the other hand, 
test results and, indeed, the actual tests were made available to the 
teachers so that they could know both the total scores and how 
these scores compared with the norm group, and could also see in 
detail where individual students made their mistakes, some real use 
could be made of the testing program. Provision for the type of 
analysis described in Chapter 15 of this volume would definitely go 
far toward satisfying this criterion of “usefulness.” 

In general, there is a very strong case to be made for this concept, 
for even though the tests are valid and reliable, and we have been 
prudent with time, energy, and resources, if the end results are not 
readily and easily usable by the classroom teacher and the student,
there is a serious question as to the value of the testing activities. A 
major consideration, then, in the selection of a technique, or in mak- 
ing a choice between two techniques, would revolve around the 
question of “How can we best use these data or this technique in 
helping boys and girls toward more efficient learning?” 


CHAPTER 
6 


General Suggestions for 
Test Construction 


WHILE THE CHAPTER relating to criteria for the selection and use 
of evaluation techniques applies equally well to standardized tests, 
teacher-made examinations, and the informal methods of measure- 
ment, this chapter deals more directly with procedures for construct- 
ing teacher-made tests and inventories. More specific suggestions for 
the construction of various kinds of objective tests, essay tests, and 
the variety of informal measures will be given in the chapters which 
follow. 

As described in previous chapters, evaluation procedure begins 
with the statement of general objectives for the school; these gen- 
eral objectives then are defined more specifically for each subject- 
matter area or level of instruction. It is at this point that the 
objectives are described in terms of the actual behaviors teachers 
assume will result from the instructional process, behaviors being 
used in the broadest sense to include not only overt behaviors but 
such mental processes as thinking, knowing, and reacting. Having 
defined the expected outcomes of instruction in such terms, the 
teacher then must decide what content will best lend itself to the 
attainment of these objectives, content being used as a means to an 
end rather than an end itself. This is not to imply that content is 
not important, for little could be done in any educational process 
without content or materials with which to work. 

By means of the two-dimensional chart or grid, teachers attempt 
to insure that items measuring both content and behavioral
objectives will be included in each area in proper proportion. By thus
insuring that a test adequately covers the subject matter and that it 
is directly concerned with hoped-for student behaviors, the teacher 
has satisfied the most important criterion of the test—validity. 


                                      CONTENT OBJECTIVES
                               AREA  AREA  AREA  AREA  AREA       TOTAL
BEHAVIORAL OBJECTIVES            1     2     3     4     5   (by objectives)

I. Acquire knowledge of
     A.                          5     ?     ?     ?     ?         25
     B.                          5     ?     ?     ?     ?         25
     C.                          ?     ?     ?     ?     ?         10

II. Apply basic principles
     A.                          4     2     3     1     ?         10
     B.                          0     3     3     3     ?         10
     C.                          2     3     1     3     1         10

III. Critically analyze issues   1     2     1     3     3         10

Total (by content objectives)   20    20    15    25    20        100
                                              (total number of items in test)

(? = entry not legible; in rows II-A and II-B the column of the missing
entry is uncertain.)

Fig. 4. Two-dimensional grid chart 


Figure 4 shows how a grid can be used in preparing an examina- 
tion in high school biology. Listed across the top of the figure are 
five content, or subject-matter, areas, which may represent such 
topics as human reproduction, nutrition, heredity, and so on. Listed 
on the vertical axis of the chart are seven over-all behavioral ob- 
jectives for the course, three of which deal with various kinds of 
knowledge (note that the objectives are in terms of an expected be- 
havior—the student should acquire knowledge, and so on), three of 
which spell out specific objectives dealing with application of basic 
principles, while the seventh states the objective, “Critically analyze 
issues.” 


The actual process of test construction proceeds from this basic 
grid to a weighting of the importance of each area and behavior. In 
the example, the teacher has decided that the test should contain 
one hundred items. In his estimation, “knowledge” items should ac- 
count for 60 per cent of the test, distributed 25 per cent to objective 
area 1, 25 per cent to area 2, and 10 per cent to area 3. Thirty per 
cent of the test should be devoted to “application” items, this 30 
per cent being divided equally between the three objectives listed. 
The remaining objective dealing with critical thinking is weighted 
at 10 per cent. (It might be expected that at a later stage in this 
course the weightings might change to assign more importance to 
application and critical thinking objectives.) In the same way as the 
behavioral objectives are assigned weights, so each of the content 
areas is to be represented in the test in proportion to the import- 
ance each area is assigned by the teacher. Within each cell of the 
grid the teacher then decides how the column and row totals should 
be distributed. In the example, he has decided that he should have 
five items that test for content area 1 and which deal with behavior
1A; that he should have another five items that also deal with con- 
tent area 1 but with reference to behavior 1B, and so on. Thus, each 
potential item for the test is classified in relation to specific subject 
matter and a specific behavior. With this outline, or table of 
specifications, for the finished test, the teacher can then turn to 
actual item construction. 
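
The bookkeeping behind such a table of specifications can be checked mechanically. The sketch below uses the row and column weights given in the example; the proportional starting allocation (row weight times column weight, divided by test length) is an assumption offered as one possible first draft for the teacher to adjust, not a rule stated in the text.

```python
row_totals = {"IA": 25, "IB": 25, "IC": 10, "IIA": 10, "IIB": 10, "IIC": 10, "III": 10}
col_totals = [20, 20, 15, 25, 20]   # content areas 1 through 5
TEST_LENGTH = 100

# Both sets of weights must account for every item on the test.
rows_ok = sum(row_totals.values()) == TEST_LENGTH
cols_ok = sum(col_totals) == TEST_LENGTH

def proportional_cells(row_total):
    """Starting allocation for one row across the five content areas."""
    return [row_total * c / TEST_LENGTH for c in col_totals]

start_IA = proportional_cells(row_totals["IA"])     # [5.0, 5.0, 3.75, 6.25, 5.0]
start_IIC = proportional_cells(row_totals["IIC"])   # [2.0, 2.0, 1.5, 2.5, 2.0]

# A row the teacher has filled in must still add up to its row total.
row_IIC = [2, 3, 1, 3, 1]           # row II-C as shown in Figure 4
row_IIC_ok = sum(row_IIC) == row_totals["IIC"]
```

The proportional draft gives five items for behavior 1A in content area 1, which agrees with the teacher's decision in the example. Fractional cells such as 3.75 have to be rounded by judgment, which is where the teacher's own weighting of importance enters.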

It is at this point that the teacher must decide which of the 
variety of test items will be most appropriate for his purposes. 
Should the test consist of all true-false items, multiple-choice items, 
or completion items? Is there a place for “free-response” items? 
The answers to such questions obviously depend upon the objectives 
of the course and the skill of the teacher in constructing the various 
kinds of items. Specific suggestions for the use and construction of 
test items and other evaluative techniques will be given in later 
chapters, the present remarks being intended only as general guides 
in test planning. 

One of the biggest errors teachers make in test planning is their 
tendency to wait until shortly before an examination is scheduled 
to begin to write the items to be included in the test. Often, the 
press of other duties seems much more important so that the actual
item writing is put off until the last minute. The result usually is 
that too many of the test items are poorly thought out, contain 
ambiguous terms, and in all too many cases, involve petty details 
instead of the more important and pervasive outcomes of learning. 
Constructing test items that measure up to the standards of validity, 
reliability, and objectivity demands time and energy. Even with 
professional test makers such as are employed by civil service com- 
missions, it is the exceptional item writer who can consistently turn 
out ten or more good items in a day. Most will average below this 
figure. It would seem somewhat unrealistic for the average teacher to 
expect that he can devise a valid test of fifty to sixty items if he 
waits until several days or even a week before the test is to be 
given to begin writing his items. The solution to this problem lies 
in planning ahead and in spreading out the item-writing assign- 
ment over a long period of time, utilizing such aids as item pools. 

The best time to write an item dealing with a specific area of 
content or with a particular behavior is at the time the objec- 
tive is dealt with in the classroom. At this time, the item appears 
in its complete context, and the various implications of the item 
and its direct relationship to other parts of the instructional unit are 
most clear to the teacher. Thus, the authors have long suggested that 
a part of every teacher's day should include some time for item 
writing, with particular emphasis upon the class work accomplished 
that day. Even if the teacher only produces one or two items on 
some days, the steady accumulation of items written while the “fire 
is hot” soon produces a sizable pool of items which can then later 
be incorporated into a test. Many teachers have found it useful to 
write items each day on separate cards or on half sheets of paper. 
A sample of such a card is shown in Figure 5. This type of card, 
which can be easily mimeographed, has a space for the course name, 
a code to identify the item by behavior and content area (the code 
used in the example refers to the listings in the two-dimensional 
chart shown in Figure 4), and a place for noting when the item was 
used (this one was first used on 2/4/54 and was repeated on 3/7/56), 
while the major space on the card is used for writing the complete 
item and the key. 

If items are written daily and collected on such cards with proper 
classification noted as to the area of content and behavior measured,
it then becomes a simple matter of building at least the major por- 
tion of any new examination from such a file. It is possible to de- 
velop a test file of several hundred items for any particular objective 
or content area. With such a large number of items to choose from, 
no single test item need be repeated more than once in every other
year. 


  Biology                                   1A-3
  Course                                    Behavior-Content Code

  When Administered      Which one of the following includes
                         all the others?
  2/4/54
  3/7/56                   1. Order
                           2. Species
                           3. Class
                           4. Family
                           5. Genus

                         Item                                Key


Fig. 5. Test item pool card 
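
A file of such cards can also be kept as a simple record structure. The sketch below is one possible arrangement and not a prescribed system: the `ItemCard` fields and the `eligible` rule (an item may be reused once two years have passed since its last use) are assumptions drawn from the description above. The dates are read as month/day/year, and the key is left empty because Figure 5 does not show it.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional

@dataclass
class ItemCard:
    course: str
    code: str                       # behavior-content code from the grid
    stem: str
    alternatives: List[str]
    key: Optional[int] = None       # keyed alternative; not shown in Figure 5
    administered: List[date] = field(default_factory=list)

def eligible(card, test_date, min_years=2):
    """True if the item was never used, or last used at least min_years ago."""
    if not card.administered:
        return True
    last_used = max(card.administered)
    return (test_date - last_used).days >= 365 * min_years

card = ItemCard(
    course="Biology",
    code="1A-3",
    stem="Which one of the following includes all the others?",
    alternatives=["Order", "Species", "Class", "Family", "Genus"],
    administered=[date(1954, 2, 4), date(1956, 3, 7)],
)

usable_in_1957 = eligible(card, date(1957, 3, 1))     # False: used a year earlier
usable_in_1958 = eligible(card, date(1958, 3, 10))    # True: two years have passed
```

Filtering the whole file with `eligible` and then grouping by `code` would give the teacher the stock of items available for each cell of the planning grid.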


ADMINISTERING THE TESTING PROGRAM 


Proper administration of teacher-made examinations is a must if 
examinations are to be valid and reliable measures of student prog- 
ress. Faulty mechanics of test preparation and administration can 
negate the value of the most carefully thought out and planned test. 
Thus, some general considerations bearing upon this problem will 
be discussed in the following pages. These comments are intended 
as guides with respect to teacher-made tests, further comments and 
suggestions with reference to the administration of standardized 
tests and informal measures being reserved for later chapters where 
these techniques are discussed more fully. 

Assuming that the teacher has planned his test carefully with 
proper attention paid to the objectives and content to be measured, 
and assuming further that he has assembled the required number 
of well-prepared items to satisfy the requirements of a planning
grid, the problems of actual reproduction and administration of the 
test remain. The following suggestions are offered as means of over- 
coming such mechanical problems. 

1. Reproducing examinations. The primary criterion to be con- 
sidered in the reproduction of tests has to do with clarity of pres- 
entation. Tests should be prepared so that they are easy to read 
and directions should be very carefully stated so that students will 
know immediately what is expected from them. Objective tests, in 
most cases, should be duplicated using a ditto machine, mimeo- 
graph, or any other suitable method that is available in the school. 
The exceptions to this general rule might be short class quizzes of 
perhaps no more than ten items, in which case the items may be 
written on the chalkboard. Items should be written on only one side 
of the paper, for often printing on both sides of a page tends to re- 
duce the clarity of the print and makes reading difficult. As a gen- 
eral rule, items of different types should not be mixed together, it 
being far better to include all true-false items in one section, 
multiple-choice items in another, and so on throughout the test. 

No set rules can be given as to the arrangement of similar type 
items within sections of the test, although it is recommended that 
the easiest items in each portion of the test should come first, fol- 
lowed by the more difficult items. This gives all students a sense of 
accomplishment at the beginning of the test and encourages them 
to go on. Beginning a test with the more difficult items has the 
effect of discouraging some students at the very outset. Whether 
one chooses to arrange items in order of difficulty throughout the 
test, regardless of objective or area of content measured, or to ar- 
range items separately by area and/or objective, is a matter of in- 
dividual preference. While the former plan allows the student to 
attempt all the easier items first and then to proceed to the more 
difficult ones, the latter plan has value in that there is a commonal- 
ity of intent within the sections of the test that may not only give 
psychological assurance to the student but which also facilitates 
scoring and analysis of the examination. 

In the case of multiple-choice items, each item should be com- 
plete on a single page. If there is not enough room on the bottom of 
the page for the complete item, it should be started on the next
following page. The same would be true for matching items, all of the
various alternatives in both columns being placed on a single 
page. In most cases, it is advisable to set up multiple-choice items in 
double columns on the page unless the lead or stem of many of the 
items is quite long. A two-column arrangement in which all of the 
alternatives for each question are listed one below the other, a 
double space being left between items that are themselves single- 
spaced, is generally more economical of space than is the arrangement 
in which each item is written completely across the page. A procedure 
to be carefully avoided in setting up multiple-choice items is the 
tendency to list all of the alternatives for an item in paragraph 
style. Each alternative should occupy a separate line. Whether one 
uses small letters or Arabic numerals to identify the various
alternative answers will depend to some degree on the type of answer
sheet used, neither letters nor numbers having any special ad- 
vantage. 

If students are asked to fill in a series of blanks or if free-response 
items are used, sufficient space must be left for the fullest expected 
answer, with allowance given for fairly large writing. Nothing can 
be more frustrating than to be asked a question that demands a 
fairly full explanation with but a single line or two provided for the 
answer. 

2. The use of separate answer sheets. With most standardized 
tests, it has proved to be most economical of time, energy, and 
money to use separate answer sheets. Not only does this permit the 
test booklets to be used over and over again, but the actual scoring 
of such answer sheets, as noted before, is much easier than scoring 
answers written directly in the test booklets. With most teacher- 
made tests the problem is somewhat different, for few of the tests 
of this kind are intended to be used more than once, unless there are 
two or more sections of a single class or course. In this latter case, 
there is a possible parallel to the use of standardized tests. With a 
test intended to be used just once, however, one must consider not 
only the possible saving of time that would come about through the 
use of separate answer sheets, but also the opposing factor of in- 
creased cost that would ensue in preparing not only a test booklet 
but an answer sheet as well. One further, and very important, 
factor must also be considered. If the test is to be used as a teaching 
device, and the authors would hold that this is the primary function
of any measurement, actual learning may be facilitated if students
enter their answers directly in the test booklet. Then, when
the test is reviewed in class, the student can see at a glance not 
only the question asked but also his answer. In addition, a record 
of the student’s errors will be available for inclusion in the cumu- 
lative folder, thus facilitating possible remedial action. Far too often 
when separate answer sheets are used, scoring procedures are em- 
ployed which result only in a score, with no identification of items 
answered right or wrong being possible. This is a particular weak- 
ness in machine scoring of objective examinations. With hand scor- 
ing of separate answer sheets, or of the test booklet itself, the 
missed items may be used as the basis for remediation or reteaching. 

In general, the decision for or against separate answer sheets will 
have to be made in terms of efficiency and/or in terms of the use to 
be made of the tests, once given and scored. For most class tests 
given but once and involving groups of twenty-five to thirty stu- 
dents, assuming that the tests will be reviewed for learning purposes, 
the evidence tends to favor recording answers in the test booklets, 
although there are also arguments in favor of separate answer sheets. 

3. Directions to students. One of the most important parts of any 
test is the explanation of what the student is to do. Here he is told 
how he should record his answers, what time restrictions apply, spe- 
cial directions about guessing, and a host of other kinds of infor- 
mation that make clear the student’s responsibilities with respect 
to the test. 

In most cases it is not necessary to state to the student the pur- 
pose of the test, although this may well be done. It is assumed that 
the teacher will have set the stage for the testing days in advance 
so that the students will be psychologically prepared for the kind 
of test they are to take. Students should be specifically told if they 
are to use separate answer sheets or if they are to write in the book- 
let. Any special methods of recording answers should be clearly ex- 
plained. If the test is divided into sections with different types of 
items in each section, separate sets of directions for each part should 
be given. Students should be told in the case of multiple-choice items 
if they are to select the “correct” answers or if they are to select 
the “best” of the alternative answers. The latter assumes that more 
than one correct, or plausible, answer is given but with varying
degrees of correctness. In the case of true-false items, students should
be told whether they should consider the statement to be true, as an 
absolute, or true, with some reservations, with similar directions for 
items that may be false. With matching items, they should be told 
whether any of the items to be matched may be used more than 
once or if each item is to be singly matched. 

If there are time limits on the total test or on parts of the test, 
they should be specifically defined in the directions to students. 

Directions about guessing should also be included. Especially is 
this so if any sort of correction formula is to be used. While many 
test experts and statisticians argue that formulas be used with ob- 
jective tests to correct obtained scores for supposedly chance suc- 
cesses, it is the feeling of the authors that there is more to be said 
against such a practice than for it. The basic assumption made to 
support correction for guessing is that in a five-part multiple-choice 
item, for example, the student has a one in five chance of getting 
the item right even though he has not the faintest notion of the cor- 
rect answer. Thus, the advocates of this system hold that the number 
of correct answers on a test be reduced in proportion to the num- 
ber of wrong and supposedly guessed answers. The present authors 
would hold that there are few times when pure chance enters 
into testing in line with this assumption. Even with true-false items 
in which the student supposedly has a fifty-fifty chance of get- 
ting the item right just by guessing, there are probably few cases 
in which this fifty-fifty proportion holds. The student’s possible 
knowledge of an answer runs from 100 per cent confidence in the cor- 
rectness of an answer through zero confidence. For some items, the 
student may be “pretty sure,” or he may only have a hunch as to the 
correct answer. In these cases, while the student doesn’t really know 
the answer, his response cannot be treated as a pure guess. Thus, 
the assumption underlying correction for guessing is challenged. 
Beyond this, the writers also hold that telling students not to 
attempt an answer unless they are absolutely sure that they are 
right sets up a negative block that is contrary to most that is known 
about the psychology of learning. For many students who tend to be 
on the introverted side, directions advising against guessing, lest a 
penalty be attached to their scores, in effect penalize them for having
a certain psychological structure over which they have no control.
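
For readers who have not met the practice argued against here, the conventional correction can be sketched as follows. The formula used, right answers minus wrong answers divided by one less than the number of choices, is the usual textbook form and is not stated in this chapter; the function name and the sample figures are illustrative assumptions only.

```python
def corrected_score(right: int, wrong: int, choices: int) -> float:
    """Conventional correction for guessing: R - W / (k - 1).

    Omitted items are neither rewarded nor penalized.
    """
    if choices < 2:
        raise ValueError("an item needs at least two choices")
    return right - wrong / (choices - 1)


# Hypothetical student: 100 five-choice items, 76 right, 20 wrong, 4 omitted.
print(corrected_score(76, 20, 5))   # 71.0
# The same counts on a true-false test (two choices) cost far more.
print(corrected_score(76, 20, 2))   # 56.0
```

The penalty treats every wrong answer as a blind guess, which is the very assumption disputed in the paragraph above.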


A third factor that the authors feel should be considered deals 
with the artificiality set up by directions against guessing when 
educational practice and real-life practices are compared. How 
many times during an average day do we make decisions in which 
we have 100 per cent confidence or 100 per cent knowledge of all of 
the factors involved? Couldn’t we truthfully say with regard to many
of the decisions we are forced to make that in many we acted “to 
the best of our ability even though we weren’t absolutely sure”? Why 
then should we correct test scores to remove effects of guessing, hav- 
ing already admitted that much of what we call guessing does not 
fit the laws of chance? It would be the considered opinion of the 
writers that students should be encouraged to try every item in the 
test and to use to the fullest their knowledge in trying to answer 
even those items about which they are not absolutely sure. 

4. Test length in relation to time. One of the continuing argu- 
ments about testing refers to the relative value of power tests, those 
in which all students are given as much time as they need to com- 
plete all items, as opposed to speed tests, tests with certain definite 
time limits that may operate to the disadvantage of the slower 
thinking or acting student. As with so many other administrative 
details, the ultimate answer to any particular situation goes back 
to the objectives defined by the teacher. If the objectives define not 
only acquisition of knowledge or ability to communicate clearly 
but also imply or state some standard of efficiency or speed, the
choice must be in favor of the “time” test. On the other hand, if 
the objectives stress complete understanding without regard for 
speed, then a power test is called for. In general, and there is no 
final answer in many cases, the authors would suggest that insofar 
as possible, all students be allowed a “reasonable” amount of time 
to consider all items in the test. While it must be recognized that 
some students may ask for unreasonable amounts of time, up to 
three or four times as long as the fastest students, everybody should 
at least have a chance to demonstrate his competency or skill (as- 
suming the speed factor is not a criterion of success). If such is not 
true, the validity of the test may be questioned. A middle ground 
is suggested in some quarters with the suggestion that while rigid 
time limits not be set, the examination be halted after 90 to 95 
per cent of the students have finished. While this would deny the
slowest working students an opportunity to attempt all of the test
items, this suggestion does recognize the problem of what to do 
with those students who finish the test early and then fidget while 
the rest of the class continues to work. While there are various ways 
of keeping these students occupied, it is not reasonable or practical 
in most school situations to require all students to wait until the 
last one has finished the examination. Some arbitrary time limit 
is sometimes required. 
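
The middle-ground rule mentioned above can be illustrated with a small sketch. It assumes the teacher has noted each student's finishing time, in minutes, during an earlier untimed administration; the function name, the sample times, and the default of 90 per cent are illustrative assumptions.

```python
import math


def cutoff_time(finish_minutes: list[float], share: float = 0.90) -> float:
    """Return the earliest time by which at least `share` of students finished."""
    if not finish_minutes:
        raise ValueError("no finishing times recorded")
    ordered = sorted(finish_minutes)
    # Number of students who must be done before the test is halted.
    # Rounding first guards against tiny floating-point overshoot.
    needed = math.ceil(round(share * len(ordered), 9))
    return ordered[needed - 1]


# Hypothetical finishing times for a class of twenty.
times = [22, 24, 25, 25, 26, 27, 27, 28, 29, 30,
         30, 31, 32, 33, 34, 35, 36, 38, 45, 70]
print(cutoff_time(times))        # 38
print(cutoff_time(times, 0.95))  # 45
```

In this made-up class one very slow student would otherwise hold the group for 70 minutes, which is the practical difficulty the paragraph describes.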

One suggestion in connection with test timing that has wide ap- 
plication refers to the practice of keeping students informed during 
the test as to what time is available for completion of the test. 
While some students will unfortunately get somewhat panicky as 
time passes and they have not yet completed what they think they 
should have completed, the majority of students will profit from 
knowing how much time remains at various intervals during the 
testing session. 

Gauging the amount of time needed for the completion of a 
particular test is an ability that comes with experience. While certain
general yardsticks have been proposed from time to time, e.g.,
true-false items can be answered at the rate of four to five per 
minute while multiple-choice items require about a half minute 
each, so much depends upon the kind of content dealt with and the 
skill of the teacher in writing clear and unambiguous items that 
these general rules have but limited application. Each teacher must 
determine through experience the time limits which are practical 
for his students and his type of test. 
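
As a rough illustration only, the yardsticks quoted above can be turned into a first estimate of working time. The sketch assumes nothing beyond the two rates given in the text; the function name and the sample test are hypothetical, and the result is a starting point to be adjusted by experience.

```python
def estimated_minutes(true_false: int, multiple_choice: int) -> tuple[float, float]:
    """Return (low, high) estimates of working time in minutes.

    Uses the rough yardsticks quoted in the text: four to five true-false
    items per minute and about half a minute per multiple-choice item.
    """
    mc_minutes = multiple_choice * 0.5
    low = true_false / 5 + mc_minutes    # fast pace: five true-false items per minute
    high = true_false / 4 + mc_minutes   # slow pace: four true-false items per minute
    return low, high


# Hypothetical test: 40 true-false and 30 multiple-choice items.
print(estimated_minutes(40, 30))  # (23.0, 25.0)
```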

5. Scoring the examination. Whether the teacher uses standardized 
tests with separate answer sheets, has his students record answers 
directly in the test booklet, or develops his own test procedures, 
there arises a need for some method of efficient scoring. 

In large school systems, it has been found both convenient and 
economical to rent IBM test-scoring equipment so that tests ad- 
ministered with separate answer sheets can be machine scored. De- 
pending upon the test and the skill of the person doing the scoring, 
from two hundred to four hundred answer sheets per hour can be 
scored with such a machine. Not only can this scoring be done rap- 
idly, but it can be done by a skilled clerk, thereby permitting the 
teacher to do something more creative with his time. Rental costs
for such a machine tend to be heavier than the average school can
afford, so an in-between step has been developed in which the test 
scoring is done by a commercial agency. Such a test service bureau, 
sometimes located at a state university testing center, will score 
tests for public schools at a nominal fee, an arrangement which is 
usually more economical and practical than renting the scoring 
equipment. 

In all of these machine-scoring operations, which incidentally de- 
mand that all answers be recorded by the student with special heavy 
graphite pencils (thereby introducing another item of expense), the 
test papers are scored using the key supplied by the teacher (or in 
the case of a standardized test, by the publisher) and a single score 
is entered on the answer sheet. Part scores can also be obtained if 
separate keys are made for each part score and the papers are run 
through the scoring machine once for each part. While total scores, 
or part scores, are, of course, necessary and useful to teachers, this 
method of scoring does not mark or note where the student has 
made his errors. On a test of one hundred items, a score of 76 means
that the student missed twenty-four items, but this procedure does 
not permit an identification of the items missed, so that the student 
can review his test to learn those items he had missed. Thus, unless 
the papers are later gone over either by the teacher or by the students 
in a group and the wrong answers are identified, directed learning 
based upon the examination results is hindered. 
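
The idea of part scores obtained from separate keys can be sketched in a few lines. This is not a description of how the scoring machine works; it is a minimal illustration that assumes answers and keys are recorded as choice numbers by item number, and all names and data are hypothetical.

```python
def part_scores(responses: dict[int, int],
                key: dict[int, int],
                parts: dict[str, range]) -> dict[str, int]:
    """Count right answers within each named part of the test."""
    scores = {}
    for name, item_numbers in parts.items():
        scores[name] = sum(
            1 for item in item_numbers
            if responses.get(item) == key[item]
        )
    return scores


# Hypothetical six-item test in two parts; values are the choice numbers marked.
key = {1: 3, 2: 1, 3: 4, 4: 2, 5: 5, 6: 1}
responses = {1: 3, 2: 1, 3: 2, 4: 2, 5: 5}          # item 6 omitted
parts = {"vocabulary": range(1, 4), "reading": range(4, 7)}

scores = part_scores(responses, key, parts)
print(scores)                 # {'vocabulary': 2, 'reading': 2}
print(sum(scores.values()))   # 4
```

Like the machine score, these counts say how many items were missed in each part but not which ones.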

Separate answer sheets can be scored by hand as well as by 
machine, although it must be recognized that this is a much more 
time-consuming process. In the absence of a scoring machine or a 
service contract with a testing bureau, hand scoring of objective 
examinations using a scoring stencil is the most efficient method. 
Common procedure in such scoring calls for the punching out of a 
blank answer sheet (or a special IBM stencil mat), the blank spaces 
corresponding to the correct answers for each item. This mat is then 
placed over each of the answer sheets in turn and the number of 
marks that show through the punched holes becomes the person’s 
score. While a total score is thus easily determined with a minimum 
of time, the authors would suggest that class discussion and post- 
test teaching can be greatly facilitated at the expense of some time, 
if at the time of scoring each wrong answer is checked. This can
be done very simply by marking with a red pencil through each
punched hole that does not disclose a student response. Thus, in 
item 24, for example, if the student checked response number 3 and 
the key (the space punched out on the scoring mat) was also num- 
ber 3, his response would show through and would be counted as an 
item right. If on item 25, the student again marked number 3 but the 
key was number 2, then a blank space would show through the 
scoring mat. The teacher would then put a small check mark in 
red through the punched out space. In this way, each wrong re- 
sponse could be brought directly to the attention of the student 
and he could then be made responsible for his own review and learn- 
ing of those concepts and understandings that he had missed during 
the examination period. 
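
The stencil procedure just described amounts to comparing each response with the key and noting the items where the keyed space is empty. A minimal sketch, using items 24 and 25 from the example above; the function name and data layout are assumptions made for illustration.

```python
def score_with_marks(responses: dict[int, int], key: dict[int, int]):
    """Return (score, items_to_mark) for one answer sheet.

    An item is marked when the keyed space shows no student response,
    which covers both wrong answers and omissions.
    """
    to_mark = [item for item in sorted(key) if responses.get(item) != key[item]]
    score = len(key) - len(to_mark)
    return score, to_mark


# Items 24 and 25 from the text: the student marked 3 on both,
# and the key calls for 3 and 2 respectively.
key = {24: 3, 25: 2}
responses = {24: 3, 25: 3}
print(score_with_marks(responses, key))   # (1, [25])
```

The list of marked items is what the student takes back to the test booklet for review.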

The above procedure assumes that definite provision will be made 
for review of the test, at which time the student can either indi- 
vidually or in class be given the opportunity of taking the test book- 
let and his answer sheet to see where he made his errors. As noted 
in a previous section of this chapter, this same objective of test re- 
view may be approached through the use of test booklets that re- 
quire the student to record his answers on the booklet. Scoring, in 
this case, is a much more time-consuming procedure than scoring 
separate answer sheets. Many teachers believe, however, that there 
is an advantage in marking or noting errors made by students right 
on the test booklet, because it facilitates review and learning by 
having both test item and response side by side on the same page. 

In the comments to this point, little has been said about one of 
the most important scoring resources of the school—the students 
themselves. Assuming that the teacher has accepted as a primary 
purpose of evaluation the goal of helping students to know their 
strengths and weaknesses so that they can help themselves, and 
assuming that students have accepted this purpose as their own and 
are ready and willing to look at themselves objectively, full ad- 
vantage of this resource can be taken. After the examination has 
been concluded and as it is being reviewed in class (either in the 
same or in the immediately following class meeting), students may 
profit from scoring their own test papers as each item is discussed. 
Granting that there may be individuals who may attempt to change 
answers at this time to improve their score and standing within
the group, the atmosphere in which the teacher uses evaluation to
help students should be one that is conducive to student honesty and 
integrity. 

The process of having students score their own papers not only 
eliminates the necessity of the teacher’s putting in hours of valuable 
time in scoring, but has the very decided advantage of permitting 
students to see and evaluate their own errors as the test is reviewed. 

6. The reuse of tests and test items. One of the major advantages 
of utilizing an item pool, or some other means of systematically 
writing and collecting test items throughout the semester or year, 
is that an available supply of test items would always be on hand
for class use. With such a pool or file of items, the temptation to 
repeat the same examination is minimized. Students could not rely 
upon a cursory study of a previous examination but would have to 
be prepared for a comprehensive survey of an area. 

With a large enough collection of test items appropriate to all of 
the objectives of the course available, the premium placed upon 
securing last year’s examination would be eliminated. It is recom- 
mended that files of previous examinations be made available to 
students to enable them to study sample items of the type that the 
teacher thinks are important. Such a procedure would help students 
to study much more efficiently and would encourage initiative and 
responsibility. Since any one item would not be used more than 
once in any year, and complete blocks of items would never be re- 
peated, the value of memorizing specific items would be meager. 
Study of test items is an excellent review technique as long as the 
study of actual items does not become an end in itself. 
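
The rule that no item be used more than once in a year can be sketched as a simple draw from the item file. The sketch assumes each item carries an identifying label and that the teacher keeps a record of the labels already used; the names are hypothetical. Random selection makes the repetition of a whole block unlikely but does not by itself guarantee it.

```python
import random
from typing import Optional


def draw_items(pool: list[str], used_this_year: set[str], count: int,
               seed: Optional[int] = None) -> list[str]:
    """Pick `count` items from the pool that have not been used this year."""
    available = [item for item in pool if item not in used_this_year]
    if count > len(available):
        raise ValueError("not enough unused items left in the pool")
    chosen = random.Random(seed).sample(available, count)
    used_this_year.update(chosen)   # record the items so they are not reused
    return chosen


# Hypothetical pool of twenty items, two of which were used earlier this year.
pool = [f"item-{n:03d}" for n in range(1, 21)]
used = {"item-001", "item-002"}
first = draw_items(pool, used, 5, seed=1)
second = draw_items(pool, used, 5, seed=2)
print(set(first) & set(second))   # set()
```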

If tests and test results are used only in an administrative sense,
that is, with little regard for the human values involved in evaluation
of student progress, then there is every reason for students to fear
evaluation as something being done to them. If, on the other
hand, tests and evaluation procedures are focused on how the re- 
sults obtained can be used with and for the students, as means of 
helping them to understand their status and progress with regard 
to the objectives of instruction, then there is every reason to expect 
that students will favor frequent testing. 

Students can be helped to understand themselves through a wise 
use of tests and other evaluation devices. It is vitally important that
teachers understand that tests are not clubs to be held over the
heads of students and that test results are not a means of securing 
control in the classroom. Teachers need to learn to treat the results 
of testing as analyses of personal strengths and weaknesses, as evi- 
dences of growth and progress toward the objectives of the school, 
and as a means of facilitating learning and progress toward ma- 
turity. 


CHAPTER 7


Constructing and Using Objective Tests 


TEACHERS FREQUENTLY ASSOCIATE the concept of measurement and 
evaluation with the giving of tests. The tests, prepared, administered, 
and scored by the individual teacher, are looked upon as the primary 
means of securing information about students so that a grade can 
be entered on a report card at specific periods during the course of 
the school year. There is little doubt that the teacher-made test does 
serve this function. To effectively and efficiently use teacher-made 
tests, however, consideration must be given to several important 
questions:

1. What unique purposes are served by using teacher-made tests?

2. What are the different types of test items that the teacher can use
to measure pupil progress?

3. What are the advantages and limitations of the various types of
test items?

4. How can testing by means of teacher-made tests be made more
objective?

Adequate answers to the above questions should enable the teacher to
construct instruments that will meet the tests of validity, reliability, 
objectivity, efficiency, and usefulness described in Chapter 5. 

It is probably correct to assume that most of the teacher’s evalua- 
tion of his students at the present time is based upon the results of 
self (teacher)-made tests. If the assumption developed in the earlier
part of this book, i.e., that to be most effective the process of measurement
and evaluation must make use of a variety of techniques,
is accepted, then it becomes important to see that the test made by
the teacher is only one technique of measurement that can be used 
and that it is a technique which serves many purposes. The unique 
purposes that may be served by using teacher-made tests in the 
classroom are to permit the teacher to— 


1. Measure and appraise student progress in terms of specific classroom
objectives.

2. Provide motivation for learning of specifics, as well as generalizations.

3. Secure evidence of individual and group strengths and weaknesses
while the students are at work upon a specific unit.

4. Test frequently.

5. Use a relatively inexpensive form of measurement.

6. Secure specific information for reporting purposes.

7. Locate evidence that will be useful in making immediate modifications
of curriculum and instructional procedures.


Informal, teacher-made tests are used because they enable the 
teacher to engage in continuous appraisal. There are, however, sev- 
eral limitations to the use of such tests that must be recognized if 
the teacher is to make effective and efficient use of this means of 
measurement. The foremost limitation to the use of teacher-made 
tests is the inadequate knowledge of most teachers concerning the 
principles of test construction. Test construction is a skill that can 
be learned, but, just as it takes time and practice to learn how to 
read, hit a golfball, or sew a skirt, it takes time and practice to learn 
the skills of test construction. Another limitation that needs to be 
recognized is that good test items may take considerable time to pre- 
pare. Since there are limits to the time that is available to teachers, 
the problem of finding time to construct items may pose difficulties 
for some teachers. If, however, the teacher constructs test items on a 
daily basis, this limitation can be overcome. 

In several other respects there are limitations to the use of the 
teacher-made test. While it is not impossible to determine the 
reliability of such tests, it is highly improbable that teachers will 
do very much of this type of analysis. As a result, many teacher- 
made tests are far less reliable than are the standardized tests. This 
lack can be overcome as teachers become more proficient in preparing
test items. Another of its shortcomings is that comparable
forms are not readily available. Not being in a position to engage in
elaborate standardization procedures, the teacher will settle for 
one form of a test. Because there is only one form, it may be diffi- 
cult for the teacher to determine the pupil growth that has actually 
taken place over a period of time. The lack of norms may also be 
considered a limitation, although each teacher may rather easily 
develop his own set of norms for his classes. 
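
One simple way in which a teacher might build such local norms is to convert a score into a percentile rank against scores kept on file from earlier classes. The text does not prescribe a method; the definition used here (per cent of scores below, counting ties as half) is a common convention adopted as an assumption, and the scores are invented.

```python
def percentile_rank(score: float, past_scores: list[float]) -> float:
    """Percentage of past scores below `score`, counting ties as half."""
    if not past_scores:
        raise ValueError("no scores on file")
    below = sum(1 for s in past_scores if s < score)
    equal = sum(1 for s in past_scores if s == score)
    return 100 * (below + 0.5 * equal) / len(past_scores)


# Hypothetical scores accumulated from earlier classes on the same test.
past = [52, 55, 61, 64, 70, 70, 76, 82, 88, 94]
print(percentile_rank(76, past))   # 65.0
print(percentile_rank(70, past))   # 50.0
```

Norms of this kind describe only the teacher's own classes and grow more dependable as more scores are filed.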

If it is remembered that the teacher-made test is only one of the 
several techniques of measurement that can be used during the 
course of the school year, then the limitations described above are 
not totally detrimental, although they must be understood if the 
teacher-made test is to be used properly. 


USING OBJECTIVE AND FREE-RESPONSE TESTS 


In Chapter 5 the concept of objectivity was held to encompass 
two main ideas: (1) clarity of item meaning, and (2) the estab- 
lishment of a set of conditions conducive to uniform scoring 
and interpretation. Tests containing true-false, multiple-choice, 
and matching items are referred to as objective tests, while those 
containing completion and essay test items are classified as sub- 
jective tests. This distinction is somewhat superficial, since it is 
possible for true-false items to contain factors that make them 
subjective. It is also possible for essay tests to be constructed so 
that the questions and problems are clearly defined and standard 
procedures for their interpretation are prepared, thus increasing 
objectivity. 

There is perhaps no completely satisfactory means of classifying 
test items, and the authors, for the purpose of clarification, make a 
distinction between objective and free-response test items as fol- 
lows: An objective test item is one in which the person taking the 
test is restricted in his choice of response. For example, the student 
answering any of the following questions must check one of the four 
alternatives presented as an answer to each problem. The student 
has no chance to take issue with the statement or any of the alterna- 
tives. 

Examples of objective test items: 


The teachings of Confucius have influenced Chinese history by— 
1. Persuading the Chinese to worship only one God.
2. Popularizing the idea that the end justifies the means.
3. Slowing the rate of adjustment to modern conditions. 
4. Encouraging the movement for national unity. 
5. Sponsoring vocational schools for the peasants. 


A man is found unconscious on the floor beside a gas stove from which 
gas is issuing. The most important thing to do is— 


1. Call a doctor.
2. Take the man to a hospital.
3. Throw cold water on the man.
4. Get the man into fresh air and start artificial respiration.
5. Rub the man's arms and legs.


A free-response type test item is one in which the person answer- 
ing the item is not restricted to a prefabricated list of responses 
from which to select his answer. The student taking a free-response 
test is free to analyze the question in any manner he desires pro- 
viding he stays within the limits of what is asked for by the test 
maker. 

Examples of free-response test items: 


1. Why are relatively fewer victims of polio permanently crippled 
now than was the case a generation ago? 

2. Compare the objective examination with the essay type of test 
from the standpoint of scoring. 

3. Summarize the events that took place after 1931 that led to the 
outbreak of World War II in 1939.


By using this distinction between objective-type test items and 
free-response test items, it is somewhat easier to understand the 
generalization that all types of test items can be made objective. 


CONSTRUCTING A TEST 


Once having decided to test a group of students, the teacher needs 
to bring into action a series of steps that will result in the con- 
struction, administration, and interpretation of a test. The pro- 
cedure that the teacher will need to use has been fully discussed 
in the preceding chapter and it is desirable to review those generali- 
zations at this point. They can be summarized as follows: 


1. Decide as specifically as you can the aspects of student growth
that you are trying to evaluate. Whenever possible it would be
highly desirable to make use of the two-dimensional chart of objectives
that has been described in Chapters 4 and 6.

2. Determine the kind of test that you want to construct. The type 
of items that you use will depend to a large extent on the purpose 
of the test, the objectives that you are trying to evaluate, and the 
students that are to be tested. 

3. Use your own test file plus test items available from other 
sources, and assemble the test. 

4. After carefully checking the test copy, being sure that adequate 
directions are included, have the test reproduced and made ready 
for administering. 


THE TRUE-FALSE ITEM 


Probably the most commonly used objective-type test item is the 
true-false, or alternative-response, item. While the primary method 
of using the true-false item is to present a statement and then to 
ask the examinee to mark the statement T or F, there are many 
variations of this pattern as can be seen from the examples that
follow. 


Read the following statements and then circle the T if the statement
is true or circle the F if the statement is false.
T F 1. Des Moines is the capital of Iowa. 
T F 2. It is desirable to include butter in the diet because it is a 
good source of vitamin D. 


Read the following statements and then circle the word yes if you 
agree with the statement or the word no if you disagree with the state- 
ment. 

YES NO 1. The President of the United States can only hold
office for two years.
YES NO 2. It is desirable for all children to receive a Schick shot
before they enter kindergarten.


Read the following statements and if you think the statement is cor- 
rect as it is presented place a plus (+) mark in the appropriate space 
and if you think the statement is incorrect place a minus sign (—) in 
the appropriate space. 

____ 1. The factors of x² + x - 2 are (x + 2) and (x - 1).

____ 2. When a brown-eyed male and a brown-eyed female produce
offspring there is one chance out of four that a blue-eyed
offspring will be produced.


There are many other variations of this pattern such as Right- 
Wrong, Correct-Incorrect, Agree-Disagree. There doesn’t appear to 
be any unique advantage to any of these combinations when the 
test maker is seeking either of two responses. The guiding consid- 
eration when using this form of test item is the ease with which 
the person answering the questions or responding to the statements 
will be able to understand what is wanted in the way of a response. 

Since the item appears to be a fairly easy one to construct and 
score, it has enjoyed a great deal of popularity among teachers. It 
is necessary, however, to properly assess the advantages and limi- 
tations of this type of item before using it indiscriminately. The 
advantages commonly associated with the true-false item are as 
follows:


1. A relatively large sample of subject matter can be covered, whereas,
with other types of test items, it would take double or triple the 
amount of time to cover the same material. 

2. Tests of this type can be scored objectively and efficiently. This is 
not true of all types of test items. 

3. Items can be constructed quickly and easily. (It needs to be 
emphasized that many test experts maintain that to construct good 
true-false items is not easy.) 

4. It is possible to construct items of this type for almost all types of 
factual material. 


Travers, in his book How to Make Achievement Tests, has 
summed up the limitations of the true-false item as follows: 


1. The true-false item usually fails to present a realistic type of prob- 
lem, that is to say, a problem similar to those which the educa- 
tional program has taught the student to solve. Rarely in life is a 
person faced with the problem of deciding whether a statement is 
true or false. 

2. The true-false item tends to be what has been called textbookish, 
that is to say, it tends to measure the extent to which the student 
remembers a particular textbook he has read. 

3. The true-false item is limited in the outcomes it can measure. Be- 
cause of this limitation, most tests consisting only of true-false 
items measure only a few of the outcomes which it would be de- 
sirable to measure. Usually, such tests do not measure some of the 
more important outcomes but are limited to outcomes related to 
the acquisition of the terms and concepts in the field.

4. A true-false item provides rather low reliability per item. Of
course, it must be remembered that even if true-false test items 
can be made reliable, their validity and comprehensiveness may 
often be questioned. In this connection, it may be noted that the 
type of true-false test commonly made by teachers consists of only 
fifty to seventy-five items and is, in most cases, practically worth- 
less as a measuring instrument. 

5. Contrary to common belief, good true-false items which measure 
significant outcomes are particularly hard to write.¹


Suggestions for writing satisfactory true-false test items 


While not all of the limitations described above can be corrected, 
it is possible to prepare true-false test items that can be useful in 
analyzing what students have learned over a specific period of time. 
These suggestions need to be considered totally, for gross violation 
of any of the suggestions will lead to unsatisfactory test items. 

1. Write statements that encourage students to apply what they 
have learned. It is a relatively simple matter to copy sentences 
directly from a book or to modify them slightly and in this way 
prepare a test. Questions prepared in this manner, however, generally
only test memorization or the ability to recall what the
author has said. By using verbatim quotations from the book, the 
tester may only be testing the student’s ability to recognize or re- 
call isolated bits of information and not the student’s actual under- 
standing of the subject matter. 

The test author must recognize the desirability of writing items 
that enable the student to demonstrate competency in applying what 
has been learned. Test-item construction is not merely a matter of 
“don’t write test items by copying directly from a book,” but more 
positively “write test items that require the student to use what 
he has learned.” 


Poor—
T F 1. The formula for finding the area of a rectangle is A = LW.
T F 2. A “catalyst” is a nonreacting substance in a chemical solution.
T F 3. The authors of our textbook are college professors.

Better—
T F 1. The area of a rectangular surface 6’ × 4’ is 10 square feet.
T F 2. When a piece of copper is added to an acid solution the
resulting reaction indicates that copper is a catalyst.
T F 3. When evaluating a textbook it is important to secure bibliographic
data about the authors.

¹ Robert M. W. Travers, How to Make Achievement Tests (New York: Odyssey
Press, 1950), pp. 50-51.


2. Write statements that are either completely true or completely 
false. One of the major faults associated with true-false test items 
is that the student can frequently guess at the correct answers be- 
cause the test author has provided him with many clues. Words such 
as “all,” “always,” “none,” “no,” or “nothing” are frequently as- 
sociated with statements that are false. Other words such as “may,” 
“should,” “sometimes,” “often,” or “as a rule” are frequently associated
with statements that are true. This difficulty can be avoided
by the test maker if he is careful in using words that do not provide 
unwarranted clues. 
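
A teacher editing a draft test could check each statement against the clue words listed above. The following is a minimal sketch, not a complete editing tool: the two word lists contain only the words named in this paragraph, and the function name is hypothetical.

```python
import re

FALSE_CLUES = ["all", "always", "none", "no", "nothing"]
TRUE_CLUES = ["may", "should", "sometimes", "often", "as a rule"]


def find_clues(statement: str) -> dict[str, list[str]]:
    """List the clue words from the text that occur in a statement."""
    lowered = statement.lower()

    def present(words):
        # \b keeps "no" from matching inside "nothing" or "know".
        return [w for w in words if re.search(rf"\b{re.escape(w)}\b", lowered)]

    return {"suggests_false": present(FALSE_CLUES),
            "suggests_true": present(TRUE_CLUES)}


print(find_clues("All mutations have resulted in the development of new species."))
# {'suggests_false': ['all'], 'suggests_true': []}
print(find_clues("Persons injured in accidents should never be moved."))
# {'suggests_false': [], 'suggests_true': ['should']}
```

A flagged word is only a prompt to reread the item; whether it actually gives the answer away remains a matter of judgment.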

Since much of human behavior represents neither black nor white 
but consists of various shades in between, it may often be difficult 
to phrase statements that students can answer “true” or “false.” 
Students are often confronted by vague generalities on true-false 
tests that can only be answered correctly if the student reads the 
mind of the instructor. The cure for this problem is to use the true- 
false test item only in situations where a completely true or a com- 
pletely false answer can be given. 


Poor— 
T F 1. Jazz music is always associated with Negro singers and 
instrumentalists. 
T F 2. Art is an expression of one's emotions.
T F 3. All mutations have resulted in the development of new 
species. 
T F 4. Persons injured in accidents should never be moved until a 
doctor has approved such action. 
Better— 


T F 1. Negro singers and instrumentalists were leaders in the jazz 
movement of the early twentieth century. 

T F 2. Impressionistic art is generally interpreted to be an artist's 
feeling of what he has seen. 




T F 3. Albinos are mutations that have resulted in the creation 
of new species. 

T F 4. Persons suffering multiple fractures in accidents should not 
be moved unless a doctor or other competent authority 
approves such action.


3. Write grammatically correct statements. While it would seem 
that teacher test constructors would readily observe this sugges- 
tion, it is often found that informally made tests contain double 
negatives, ambiguous sentences, and split infinitives. Errors in 
grammar cause the test taker undue confusion and do not fur- 
nish a valid analysis of what the student may really know. The 
remedy for this shortcoming is to be found in the careful editing 
of test items to eliminate the obvious errors. 


Poor— 
T F 1. Sociograms do not reveal the pupils whom are actively re- 
jected by the group. 
T F 2. The Supreme Court cannot rule on a case for which no
legal precedents have been established.
T F 3. An author cannot be considered famous if no recognition
comes to him during his lifetime.


Better— 
T F 1. Sociograms reveal pupils who are actively rejected by the 
group. 
T F 2. The Supreme Court can only rule on a case after legal 
precedents have been established. 
T F 3. To be considered famous, an author must have received
recognition during his lifetime.


4. Write statements in a direct and clear style. The purpose of a
test is not to demonstrate the test maker's verbal ability or to
confuse the student but to analyze the progress that has been made by
the student. This means that the author of true-false test items must
write items for the language level of the students, avoid complex
sentence structure, and use quantitative terms. The poor examples
that follow were written for junior high school students and they 
are poor because the test author failed to take into account the 
necessity for writing statements in a direct and clear style. 




Poor— 

T F 1. Federalism is a belief in a government with powers divided 
between a central and subordinate power. 

T F 2. In using the scientific method of analysis students are often
confronted by factors (environmental conditioning, limited 
resources, cultural timidity, and personal inertia) that pre- 
vent the individual from fully exploring the basic assump- 
tions which are so fundamental to this method. 


Better—

T F 1. A federal system of government means that the national
government divides powers with the state governments.

T F 2. It is often difficult for a student to understand the mean- 
ing of the scientific method because he is not used to work- 
ing with that method. 


5. Write statements that emphasize main points rather than trivial 
details. The important goals of education are not found in the little 
details of American history, Algebra 1, English V, or Chemistry 1, 
but in the application of these details in the broad knowledges and 
understandings, skills, attitudes, interests, appreciations, and think- 
ing skills that are so fundamental to an educated person. True-false 
items are often used by teachers to test recall of specific, insignificant 
details rather than the major aspects of subject matter and human 
behavior. Test items asking for information found in footnotes will 
not challenge the student to develop the habit of thinking but will 
stimulate memorization and cheating, as students seek to “beat the 
teacher at his own game.” Test items indicate the interest that 
teachers have in their subject-matter specialties. 


Poor— 
T F 1. On August 12, 1865, General Custer was in charge of the 
Union forces at Fort Sumter. 
T F 2. The third movement of Beethoven’s Fifth Symphony be- 
gins with a flute solo. 
Better— 
T F 1. One of the direct causes of the Civil War was the firing 
upon Fort Sumter by troops of the Virginia regulars. 
T F 2. In Beethoven's Fifth Symphony are several examples of
the composer’s reliance upon solo instruments to establish
a theme.




6. Write statements so that controversial issues are identified 
clearly as to person, school, or philosophy. There are many events 
and ideas that are not factual but represent the opinions of some per- 
son or school or philosophy. This is particularly true in the field of 
the social sciences, although it can be equally true in mathematics 
or science. Good teachers are always aware of the danger of being 
dogmatic and, where the possibility exists that there may be more 
than one version of an event or idea, it is always better to identify 
the source than to assume that only one answer is correct. 


Poor— 
T F 1. A painting reflects the artist's impression of the world. 
T F 2. When the supply of wheat is increased, the cost of wheat 
will go down. 


Better— 
T F 1. According to Van Gogh, art is only a reflection of the 
artist's world. 
T F 2. The law of supply and demand as described by Adam 
Smith would mean that the price of wheat will go down as 
the wheat supply is increased. 


There have been many attempts to develop modified forms of the 
basic true-false test-item form. These modifications have been pro- 
posed in order to circumvent the usual limitations of these items. 
The modifications are designed to enable the student to justify an 
answer or to improve an answer so that it will become a correct 
answer. The examples which follow are some of the modifications 
that have been used. 


Directions: Each of the statements that follows is true or false. If the 
statement is TRUE circle the T, if the statement is FALSE circle the F, 
and in the space provided tell why you believe the statement is false. 

T F 1. Whole sour milk has the same food value as whole “sweet”
milk.


T F 2. Unpasteurized milk may be made safe for drinking by 
boiling it for three minutes, then cooling it rapidly. 




T F 3. Fruits and vegetables may be stored several days at room 
temperature without danger of losing vitamins. 


T F 4. Cooking vegetables in their skins destroys vitamins. 


Directions: Each of the statements that follows is true or false. If the 
statement is TRUE circle the T, if the statement is FALSE circle the F, and
change the portion of the statement that is not underlined to make the 
statement a true statement. 

T F 1. The story of Silas Marner could not have taken place in
modern times with modern communications.

T F 2. The author of Silas Marner wrote of the struggles of 
middle-class English life. 

T F 3. The testimony of the witnesses in the solution of robbery 
in Silas Marner shows that most witnesses are reliable and 
accurate in their reports of events. 

T F 4. The main plot of the story of Silas Marner is the love 
affair of Godfrey and Nancy. 

T F 5. The people of Raveloe were socially equal according to the 
democratic standards of today. 


Directions: In the following statements certain conclusions are drawn 
which may or may not be true. If you think the statement is TRUE circle 
the T. If you think it is FALSE circle the F. Then circle one or more of
the five reasons (A, B, C, D, E) which support the judgment you have 
made. In some cases there will be only one correct reason, in others more 
than one. 
T F 1. Ordinary everyday speech is monotonous, unemotional, 
and weak. 
A. The words are poorly enunciated and often irrelevant.
B. Developing the voice is a matter of technique.
C. Little regard is had for color, variety, and emphasis.
D. Few people notice speech defects.
E. Phrasing is liquid and expressive.

T F 2. An actor must take the shortest and most direct line in 
crossing to a person, an object, or an exit. 
A. When an actor crosses behind another he loses his hold
on the attention of the audience.




B. The upstage actor should walk one step in advance of 
the downstage actor when talking. 

C. The actor can circle around furniture if that circling 
keeps him out of the natural line of crossing. 

D. When an actor circles the stage he keeps the attention 
of the audience. 

E. When an actor turns his back to the audience his voice 
cannot be heard. 


Directions: Read the following paragraph carefully and on the basis 
of the information that is presented decide whether or not the statements 
that follow are true or false. If they are TRUE circle the T, if FALSE,
the F. 


In a democratic society the rights of the individual are of primary 
importance. This right, however, is limited by a responsibility, the re- 
sponsibility for each individual to consider the rights of others. While 
the Bill of Rights makes clear our major rights, freedom of speech,
religion, and freedom of the press, it did not specify the obligations of
citizenship that are associated with these rights. The obligations of 
citizenship have been developed over a long period of years by actions 
of the legislative, executive, and judicial branches of our government. 
Fundamentally, it is up to the individual to make individual rights a 
working reality. 

T F 1. Congress has enacted laws defining the responsibilities of 
citizenship. 

T F 2. The right of the state over the individual is of major
importance in a theory of democracy.

T F 3. Individual rights can be legislated but they can only be
effective if individuals make them work.

T F 4. The Bill of Rights identifies the rights and responsibilities
of American citizens.

T F 5. The executive branch of government has had a part in
identifying the responsibilities of citizens.




While the true-false test item has serious limitations, it is possible 
for the teacher to use this form of testing to construct items that 
will challenge the best that students have to offer. The generalized 
rules enumerated above can be helpful to the teacher wanting to 
write good true-false test items. 




MULTIPLE-CHOICE TEST ITEMS 


Experience and research have demonstrated that the most versa- 
tile and useful form of test item is the multiple-choice item. Pub- 
lished tests make use of the multiple-choice form to the exclusion of 
most other types. This type of item has all the advantages of the 
other types and minimizes many of the existing limitations. Well- 
written multiple-choice items can assist the teacher to study various 
levels of complexity such as knowledge of fact or principle, appli- 
cation and use of knowledge, insight into basic reasons, and meas- 
urement of understanding. 

The multiple-choice item consists of two major parts: (1) the 
stem, premise, problem, question, or lead; and (2) the answers, al- 
ternatives, distractors, or decoys. 


1. A good source of protein needed for building body tissues is— 


a. rice c. butter 
b. fish d. sauerkraut


2. The Schick test is used to detect—


a. scarlet fever c. diphtheria 
b. measles d. poliomyelitis 


3. “Leaves got up in a coil and hissed, blindly struck at my knee and 
missed” suggests— 


a. a snake image 

b. the force of the wind 

c. the depth of the leaves 

d. that there was a snake hidden in the leaves 


The versatility of the multiple-choice test item has been demon-
strated by Mosier, Myers, and Price² and the pattern which they
devised has been used and modified by Adkins,³ Bean,⁴ Micheels


² Charles L. Mosier, M. C. Myers, and Helen G. Price, "Suggestions for the
Construction of Multiple-Choice Test Items," Educational and Psychological
Measurement, V (Autumn, 1945), pp. 264-67.

³ Dorothy C. Adkins, Construction and Analysis of Achievement Tests
(Washington: U.S. Government Printing Office, 1947), pp. 52-55.

⁴ Kenneth L. Bean, Construction of Educational and Personnel Tests (New York:
McGraw-Hill Book Co., Inc., 1953), pp. 55-59.




and Karnes,⁵ and others. The basic usefulness of the multiple-choice
test item is that it can answer questions relating to— 


1. Definition 
a. What means the same as _____? 
b. Which of the following statements expresses this concept in 
different terms? 


A sudden uprising against an established government by the 
people residing within that nation is known as a(n)— 


1. invasion 4. riot 
2. civil war 5. war 
3. revolution 


The symbol x in algebra is generally the same as a(n)— 


1. unknown 3. coefficient 
2. constant 4. integer 


An element that hastens chemical reactions but does not enter 
into the reaction is known as a(n)— 


1. acid 4. instigator 
2. alkaline 5. neutralizer 
3. catalyst 


2. Purpose 
a. What purpose is served by _____?
b. What principle is exemplified by _____?
c. Why is this done?
d. What is the most important reason for _____?


In an internal combustion engine the spark plug— 


1. Ignites the fuel. 

2. Maintains the proper timing. 
3. Lubricates the engine head. 
4. Adjusts the fuel intake. 


Radium is generally stored in lead receptacles. Why is lead 
used? 


1. It is an inexpensive material. 
2. It prevents radiation of the radium rays. 


⁵ William J. Micheels and M. Ray Karnes, Measuring Educational Achievement
(New York: McGraw-Hill Book Co., Inc., 1950), pp. 175-79. 




3. It can be molded easily.
4. It reacts with the radium increasing its effectiveness. 


Position is important when playing volleyball. This principle 
is observed when the player— 


1. Spikes the ball from back court.

2. Catches the ball. 

3. Spikes the ball from a front-line position. 

4. Steps in front of a teammate to save the ball. 


3. Cause
a. What is the cause of _____?
b. Under which of the following conditions is this true? 


Liquid rises in a siphon because— 


1. Increased air pressure forces the liquid up. 

2. Air pressure forces the liquid up when a partial vacuum is 
created in the tube. 

3. The air pressure is increased in the tube forcing the liquid 
up. 

4. It follows the law of gravity. 


Traffic fatalities increase when— 


1. Road conditions are good and the weather is clear.

2. It is twilight regardless of the road conditions.

3. There is ice and snow on the ground.

4. It is night and the roads are good.

5. Road conditions are poor due to rain.


4. Effect 
a. What is the effect of _____?
b. If this is done, what will happen?
c. Which of the following should be done (to achieve a given
purpose)?


In cities the increase of slum areas generally produces— 

1. Increased political activity for higher taxes. 

2. Better educational facilities within the area. 

3. An increased rate of juvenile delinquency. 

4. Efforts on the part of the residents to improve the
neighborhood.




In algebra when a constant is added to both sides of the 
equation— 


1. The value of the unknown will remain the same. 

2. The value of the unknown will be increased by the value of 
the constant. 

3. The value of the unknown will be decreased by the value of 
the constant. 

4. The value of the unknown cannot be found. 


With one minute left in a basketball game and your team is 
ahead by three points and in possession of the ball, you, as 
team captain, should— 


1. Slow the game down and use deliberate stalling tactics. 

2. Use your fast break and gamble on increasing your lead. 

3. Take time out and let the coach decide. 

4. Try to hold the ball yourself and take your chances of
being “tied-up.”


5. Association 


What tends to occur in connection with _____?


When a storm moves into an area the barometer reading will— 


1. drop 3. remain the same 
2. rise 4. react without regard to the storm 


Emotional difficulties are often accompanied by— 


1. Increased classroom effort and a corresponding rise in 
quality of work. 

2. Relatively the same classroom effort. 

3. Decreased classroom effort and a corresponding decrease in
quality of work.

4. Relatively the same classroom effort and increasing
extracurricular activities.


6. Recognition of error 


Which of the following constitutes an error (with respect to a 
given situation)? 


Only one of the following sentences is grammatically correct. 


1. The students leave the class at the end of the period.
2. In the ensuing brawl the crowd leave the men fight. 
3. For the first time we could leave them talk. 

4. “Leave them alone,” shouted the police officer. 


Which of the following is not characteristic of the principles 
upon which our constitution is based? 


1. Popular sovereignty 3. Unitary system 
2. System of checks and balances 4. Federal system 


7. Identification of error 


a. What kind of error is this? 
b. What is the name of this error? 
c. What recognized principle is violated? 


In the following sentence there is a grammatical error. Identify 
it: When all of the data is collected it will be possible to 
analyze the results. 


1. Use of dangling participle. 

2. Use of a split infinitive. 

3. Use of improper verb form. 

4. Use of improper adverbial clause.


The square root of 40,000 is solved as follows: √40000
2


What principle is violated in finding the square root of the 
above problem? 


1. Failure to carry out proper division. 

2. Failure to carry out proper multiplication. 
3. Failure to set off the square properly. 

4. Failure to set off the answer properly. 


8. Difference 


What is the (or an) important difference between _____?


The difference between the Spanish and the English as colo- 
nizers in the New World was that— 


1. The Spanish came as conquerors, the English as families to 
establish new homes. 




2. The Spanish came because of religious persecution; the 
English came as traders. 


3. The Spanish came because of poverty at home, the English 
as emissaries of His Majesty’s government. 


4. The Spanish came as missionaries, the English as im- 
perialists. 


What is the difference between an adjective and an adverb? 

1. An adverb modifies a noun and an adjective modifies a pro- 
noun. 

2. An adjective modifies a verb and an adverb modifies a noun. 

3. An adverb modifies a verb and an adjective modifies a noun 
or pronoun. 

4. An adverb modifies and an adjective tells how, why, or 
when. 


9. Similarity 
What is the (or an) important similarity between _____?


What is the important similarity between the paramecium and 
the amoeba? 

1. Both have mouths. 

2. Both are one-celled animals. 

3. Both have respiratory systems. 

4. Both are able to reproduce through conjugation. 


What is the important similarity between the writings of 

Emerson and Thoreau? 

1. Both believe in Plato's philosophy. 

2. Both gave the most importance to “nature” in determining 
the possibilities of man. 

3. Both did their best writing in England. 

4. Both sought to establish a concept of government through 
a study of the writings of the ancients. 


10. Arrangement 
In the proper order (to achieve a given purpose or to follow 
a given rule), which of the following comes first (or last or 
follows a given item)? 




Which of the following comes first in the procedure of admit- 
ting a new state to the union? 


1. Writing the state constitution. 

2. Petitioning the Congress for admittance. 

3. Approval of state constitution by the Congress. 
4. Appointing a territorial governor.


Which of the following steps follows the first in solving an 
algebra problem by substitution? 


Problem: x + y = 3
         3x + 4y = 12
1. 3x + 4(3 − x) = 12
2. y = 3 − x
3. 3x + 3y = 9
4. 2x + 3y = 9


11. Common principle 
All of the following items except one are related by a common 
principle: 
a. What is the principle? 
b. Which item does not belong? 
c. Which of the following items should be substituted?


Which of the following is not based upon civil liberties as
defined in the Constitution?


1. Women’s suffrage.
2. Freedom of speech, religion, press.
3. Equal educational opportunity for all.
4. Writ of habeas corpus.


Which one of the following is not related to the others? 


1. biology 3. zoology 
2. botany 4. astrology 


12. Controversial subjects 


Although not everyone agrees on the desirability of _____,
those who support its desirability do so primarily for the
reason that—




Those desiring fluoridation of public water supplies
maintain it would—


1. Reduce tooth decay.

2. Add to the purity of the water. 
3. Improve the taste of the water. 
4. Make “hard” water “soft.” 


Advantages and limitations of multiple-choice test items


Since the multiple-choice test item is generally regarded as the 
best for measuring most aspects of student academic development, 
it is well to know its limitations as well as its advantages. The ad- 
vantages of the multiple-choice item can be listed as follows: 


1. It is the most flexible of test items and can measure a wide range 
of mental abilities from simple recall to complex aspects of critical 
thinking. It does not follow, however, that all multiple-choice tests
measure the more complex understandings and skills, for the test 
maker’s ability to construct worth-while items is a major factor 
limiting their value. 

2. It measures the more significant outcomes of education rather than 
memorization of isolated facts. The student must be able to exer- 
cise judgment and discrimination as well as problem-solving ability 
when answering well-prepared multiple-choice items. 

3. It reduces the possibility of chance success by providing several 
plausible alternative responses. 

4. It provides greater test reliability by reducing the element of 
chance or guessing in arriving at the correct responses. 
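
The effect of the number of choices upon chance success can be put in figures. The sketch below is an illustration only; it assumes that the student guesses blindly and that every choice is equally attractive, which is seldom true of a real class. The function names and the 20-item test with a passing mark of 14 are hypothetical.

```python
from math import comb


def expected_chance_score(num_items, num_choices):
    """Average number of items answered correctly by blind guessing."""
    return num_items / num_choices


def prob_at_least(num_items, num_choices, min_correct):
    """Binomial probability of guessing at least min_correct items."""
    p = 1 / num_choices
    return sum(
        comb(num_items, k) * p**k * (1 - p) ** (num_items - k)
        for k in range(min_correct, num_items + 1)
    )


# Hypothetical 20-item test with 14 correct taken as a passing mark.
# Two choices correspond to the true-false form.
for choices in (2, 4, 5):
    print(
        choices,
        expected_chance_score(20, choices),
        prob_at_least(20, choices, 14),
    )
```

On these assumptions the blind guesser averages 10 correct on a 20-item true-false test but only 5 correct when each item offers four choices, and the chance of reaching the passing mark by guessing alone falls sharply as choices are added.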


Since a test represents a man-made technique for measuring a 
sample of student behavior, it is important to understand that even 
the best test items have their limitations. The limitations of mul- 
tiple-choice test items can be summarized as follows: 


1. It is necessary for the test maker to be skilled in writing items that 
will measure the student’s ability to interpret, discriminate, select, 
and evaluate rather than to memorize. The use of the multiple-choice
form is no guarantee that good test items will result. It is
obvious that this limitation can be overcome by teachers if they 
develop the skills necessary for writing good test items. 

2. It should be evident that any paper-and-pencil test is only a sub- 
stitute for more direct means of measurement. Even the best 




written test item can only be used as an indicator of what a stu- 
dent will do in an actual situation. The more realistic the test 
problems, the more the teacher will be able to predict actual stu- 
dent behavior. 

3. It is necessary to recognize the fact that the multiple-choice test 
item cannot measure all forms of behavior. If it is desired to 
measure the student’s ability to draw a picture, a multiple-choice 
test covering art principles will be a poor substitute. 

4. It is necessary for the test maker to be skilled in the mechanics of 
test-item construction if the multiple-choice form is to be used suc- 
cessfully. While good multiple-choice items may be no more diffi- 
cult to construct than any other form of test item, this mastery 
can only be developed after considerable practice. 


Suggestions for writing satisfactory multiple-choice test items 


Multiple-choice test items, just as any other test items, need to 
be written so that they accomplish some rather basic purposes. Satis- 
factory multiple-choice test items should be designed to measure 
the student’s ability to grasp the important educational outcomes 
that are being developed in the classroom. Realistic and prac- 
tical test items should enable the student to make use of the knowl- 
edges and skills that have been developed. Test items that are clear 
and direct, not ambiguous or vague, should be prepared so that the 
student's responses are valid indicators of what he has learned
rather than the measure of his ability to guess or cheat. 

Many of the suggestions made for writing satisfactory true-false 
test items are also applicable in this section. A good test item, 
whether true-false, multiple-choice, or essay, is satisfactory because 
the test maker has been able to apply all of the principles of test 
construction. 

1. Write test items that attempt to measure a significant outcome 
of education. Repeatedly the view has been expressed that the edu- 
cational process is not designed to develop “parrots” capable of 
repeating to the teacher what the teacher or the book said. If the 
view is accepted that education is a process designed to encourage the 
use of the highest mental processes held by students, then it be- 
comes necessary to construct test items that will test the student's 
use of the highest mental processes. 

What is significant needs to be determined by the teacher in a 




careful manner and the earlier discussion of the use of objectives is 
pertinent to recall at this point. The ability to do and to use is a mat- 
ter of more significance than to recall or to recognize, although there 
are times when a teacher is justified in using recall and recognition 
items because they are elements of the educational objectives that 
are being sought. Putting a test item in the multiple-choice form
does not automatically make it one that tests higher mental
processes.


Poor— 
The Federalist was a series of articles in defense of the new Consti- 
tution written by— 


1. Benjamin Franklin 3. George Washington
2. Alexander Hamilton 4. Thomas Jefferson 


The Constitution established a federal government according to— 


1. Article IV, section 2 3. Article I, section 8 
2. Article V, section 4 4. Article IX, section 3


Better— 
The principal cause of the failure of the government under the 
Articles of Confederation was— 


1. The limited powers of Congress. 

2. The complete sovereignty of the States. 

3. The inability of the national government to raise money. 
4. The existence of only one house of Congress. 


States in the United States do not have wars with each other as did 
the states under the Articles of Confederation because— 


1. The U.S. Constitution says the states cannot fight and remain in 
the Union. 

2. Under the Constitution the states do not have complete sover- 
eignty. 

3. The original thirteen states paid no attention to their written 
laws. 

4. The president of the United States has the power to forbid the 
states to fight each other. 


In writing test items which are significant and which measure 
varying levels of complexity, it should be apparent that copying 




directly from a textbook will not result in the construction of items 
which seriously challenge student ability to apply what has been 
learned. 

2. Write items at a vocabulary level appropriate to the students. 
One of the major difficulties that beginning teachers have when 
writing test items is to phrase the questions at the vocabulary level 
of the students to be tested. Coming, for the most part, directly 
from college classrooms, the beginning teacher is apt to use a vo- 
cabulary better suited to the college level than the junior or senior 
high school level. Unless the test items that are being written are 
designed to measure vocabulary skills, it is important that they be 
written in a manner that will enable the student to think about the 
question rather than worry about the vocabulary. 

Since the typical junior or senior high school class will consist of 
students with a range of reading ability from the fourth to the 
sixteenth grade, the vocabulary problem is indeed a serious one for 
the teacher to overcome. It is not desirable under these circum- 
stances to write test items at the level either of the poorest student or 
of the best student in the class. The teacher must seek a vocabulary 
middle ground best suited to the group being tested. If the test is to 
be a valid indicator of the student’s competency in social science, 
science, bookkeeping, or mathematics rather than an indicator of 
verbal ability, the teacher cannot sidestep the vocabulary problem. 

There is one other aspect of the vocabulary problem that also 
needs to be considered. While the wording used in writing multiple- 
choice items needs to be at the appropriate language level, it must 
also be in the jargon of the particular field being tested. It is reason- 
able to expect students to understand the vocabulary used in a 
science class if they are taking a course in science. Writing readable 
test items does not mean that the teacher needs to sacrifice the 
technical language important to many fields. 

The following items were supposedly written for students in a
junior high school. Note the use of language more appropriate to the
college level.


Poor— 


There exist innumerable methods for enhancing the ability of the
body to resist communicable diseases. What single factor increases 
this protection? 




1. Adequate physical exercise.

2. Sound adherence to the principles of good nutrition.

3. Immunization with an antitoxin as advocated.

4. Exposure to plenty of fresh air.

5. Immunization with appropriate antibodies.


Article V of the Constitution of the United States enumerates ex- 
plicitly the modus operandi for— 


1. Initiating changes in the structure of the Constitution.

2. Securing the recall of congressmen. 

3. Establishing proceedings for the ratification of international 
agreements. 

4. Describing the rights, obligations, and privileges of citizenship. 

5. Delegating inherent powers to the states. 


Better— 
What one factor most aids the body to resist communicable
diseases?

1. Adequate physical exercise.

2. Sound adherence to the principles of good nutrition. 
3. Immunization with an antitoxin as advocated. 

4. Exposure to plenty of fresh air. 

5. Immunization with appropriate antibodies. 


Article V of the United States Constitution describes the method 
for— 


1. Amending the Constitution. 

2. Securing the recall of congressmen.

3. Approving international agreements.

4. Identifying the meaning of citizenship.

5. Granting powers to the states.


3. Write each multiple-choice test item to present a clear problem 
or question. Multiple-choice test items should not be merely four 
or five unrelated true-false statements held together by the use of an 
introductory phrase or statement. The unique feature of the multiple- 
choice item is that the stem sets forth a definite problem or ques- 
tion, with the solution to be found among four or five choices. 

Violations of the above generalization are frequently found as 
test makers use clipped phrases to introduce the choices, or write 




questions that are very incomplete and vague, or use complicated 
sentence structure to introduce questions. The example that follows 
illustrates an item stem that does not identify a problem clearly. 


Poor— 
Beethoven— 


1. Symbolizes the constructive type. 

2. Starts his basic rhythm pattern and then builds upon it. 
3. Makes a definite form for his compositions. 

4. Begins with a musical theme. 


Better— 
Beethoven symbolizes the constructive type because— 


1. He starts his basic rhythm pattern and then builds upon it. 
2. He makes a definite form for his compositions. 

3. He begins with a musical theme.

4. He introduces a theme at the beginning of each movement.


4. Write distractors so that the correct answer does not reveal 
itself by clues. The function of a test is not to engage the student 
in a guessing game but to study his grasp of a particular problem or 
concept. To do this the distractors (incorrect responses) must be 
written with a great deal of care. The four or five distractors need 
to be written so that they are— 


1. About the same length as the correct response. 
2. Without obvious clues. 

3. Independent from clues in other questions.

4. All reasonably plausible. 

5. Randomly scattered from item to item. 


The multiple-choice test item choices are designed to give the 
student the opportunity to select the correct or the best answer from 
among four or five choices. To prevent the multiple-choice item 
from becoming a padded true-false test item, it is necessary to pre- 
pare the correct choice and the alternatives so that they all seem 
logical choices to the question, problem, or premise that has been 
posed in the stem of the item. Persons just learning how to construct 
multiple-choice items will frequently write items so that the answer 
is obvious because it is longer and more detailed than the other
alternatives. The following question is an example of how a clue to
a question may be found by simply looking at the lengths of the 
choices. 


In the Army Service Forces a grade 1 clerk-typist must— 
1. Be able to type from copy at the rate of 30 net words per minute. 
2. Maintain the confidential files. 


3. Read legal copy. 
4. Take dictation. 


This item could be improved by balancing the length of all the 
choices as is shown below. 


According to the Army Service Forces, a grade 1 clerk-typist must be 
able to type from copy at the rate of— 


1. 30 net words per minute.
2. 35 net words per minute.
3. 40 net words per minute.
4. 45 net words per minute.
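
The length clue illustrated by the two clerk-typist items can also be checked mechanically. The following is a minimal sketch in Python, not a procedure given by the authors. It assumes that choice 1 is the keyed answer in both versions, and the function name `length_clue` and the threshold `ratio` are hypothetical, chosen only for illustration.

```python
def length_clue(choices, correct_index, ratio=1.5):
    """Return True when the keyed answer is conspicuously longer than every distractor.

    `ratio` is an arbitrary illustrative threshold, not a published standard.
    """
    # Measure each choice by its number of words.
    lengths = [len(choice.split()) for choice in choices]
    distractor_lengths = [n for i, n in enumerate(lengths) if i != correct_index]
    longest_distractor = max(distractor_lengths)
    return lengths[correct_index] > ratio * longest_distractor


# The two versions of the clerk-typist item from the text.
# Assumption: choice 1 (index 0) is the keyed answer in both.
poor = [
    "Be able to type from copy at the rate of 30 net words per minute.",
    "Maintain the confidential files.",
    "Read legal copy.",
    "Take dictation.",
]
better = [
    "30 net words per minute.",
    "35 net words per minute.",
    "40 net words per minute.",
    "45 net words per minute.",
]

poor_flag = length_clue(poor, correct_index=0)      # the long answer stands out
better_flag = length_clue(better, correct_index=0)  # all choices are balanced
print(poor_flag, better_flag)
```

A check of this kind only flags candidates for review; whether the length difference actually gives the answer away remains a matter of judgment for the test maker.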


If the item is to be a strong multiple-choice item, all of the choices 
must be reasonable. Where it is impossible for the test maker to 
write more than two plausible choices, it would seem better to use 
something other than the multiple-choice form. The following ex- 
ample illustrates the use of obviously implausible answers: 


An important qualification for a bus driver is that he have— 


1. A knowledge of history. 

2. Relatives who will ride a bus. 
3. A license to drive a bus or truck. 
4. A good speaking voice. 


To improve the above item the test maker would have to develop
at least three more plausible answers. One possible way of improving
the above item would be to prepare it as follows:


An important qualification for a bus driver is that he have— 


1. No accidents against his record. 
2. No physical deficiencies. 

3. A license to drive a bus or truck. 
4. 20/20 vision without glasses. 


Another common error made by test makers is to include obvious 
clues in the context of one of the choices. This is done by using
similar words or phrases in the stem of the item and in the correct
response. This error is most common when test items are constructed 
by taking material directly from textbooks. An example of this error 
of test-item construction follows: 


What is the nature of a direct current according to the electron 
theory? 


1. An exchange of positive and negative charges between each mole- 
cule and the next. 

2. A current of protons in one direction. 

3. A migration of neutrons and protons in one direction. 

4. A flow of electrons in one direction. 


Improving the above item would require some modification in
the wording of the alternatives:


What is the nature of a direct current according to the electron 
theory? 


1. An exchange of positive and negative charges between each molecule
and the next.

2. A current of protons in one direction.

3. A migration of neutrons and protons in one direction.

4. A flow of negatively charged particles in one direction. 


Sometimes test makers will unconsciously provide clues to the 
correct answers by arranging the correct choices in a particular 
pattern. While this may not be a major problem, it is wise for the 
test maker to provide a random distribution of choices so that a 
correct response pattern is not developed. 
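
One way to obtain such a random distribution is to let a random number generator decide the order of the choices. The sketch below is an illustration under stated assumptions rather than a method prescribed in the text: the function name `shuffle_choices` is hypothetical, the seed is fixed only so the run is repeatable, and the item reuses the Article V choices given earlier with choice 1 taken as the keyed answer. Choices that form a natural numerical sequence are usually better left in order.

```python
import random
from collections import Counter


def shuffle_choices(choices, correct_index, rng):
    """Shuffle one item's choices and return them with the new position of the keyed answer."""
    order = list(range(len(choices)))
    rng.shuffle(order)
    shuffled = [choices[i] for i in order]
    return shuffled, order.index(correct_index)


rng = random.Random(1957)  # fixed seed only so this sketch is repeatable

item = [
    "Amending the Constitution.",
    "Securing the recall of congressmen.",
    "Approving international agreements.",
    "Identifying the meaning of citizenship.",
    "Granting powers to the states.",
]

# Simulate the key positions for a hypothetical forty-item test and tally them,
# so the test maker can see whether any position is heavily favored.
positions = [shuffle_choices(item, 0, rng)[1] for _ in range(40)]
tally = Counter(positions)
for position in sorted(tally):
    print(f"choice {position + 1}: keyed {tally[position]} times")
```

The tally is the useful part: it lets the test maker confirm that no single position carries a disproportionate share of the correct answers.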

Another concern of the test maker should be the developing of 
items that are independent. Frequently, in attempting to write 
enough test items for a test, the writer will repeat similar items and 
provide obvious clues from one item to another. It is always best 
to carefully review a test to insure independence of items. This does 
not mean that two or more items should not be written to cover the 
same material; it does mean that if two or more items are written 
to cover the same material the premise to one item should not pro- 
vide an answer to another. Neither should the correct answer to one 
item be dependent upon the correct answer to any other item. 




MATCHING ITEMS 


Matching items are used by teachers when they want students to 
relate such things as dates and events, terms and definitions, persons 
and places, causes and effects. Micheels and Karnes have said that 
the matching exercise may require the student to match such things 
as— 

a. Terms or words with their definitions. 

b. Characteristics with the mechanical units to which they apply. 

c. Short questions with their answers. 

d. Symbols with their proper names. 

e. Descriptive phrases with other phrases. 

f. Causes with effects. 

g. Principles with situations in which the principles apply. 
h. Parts or mechanical units with their proper names. 

i. Parts with the unit to which they belong. 

j. Problems with their solutions. 

While the matching type of test item is used because it seems to 
be relatively easy to construct and does measure certain types of 
skills, most test experts agree that the chief disadvantage of match- 
ing is that it is not very well adapted to the measurement of real 
understanding as distinguished from rote memory. It is also said 
that irrelevant clues will be present to a much greater extent than 
in other forms of test items. Generally, it is recommended that 
multiple-choice items be used whenever possible in place of matching 
items. 


Directions: The following list contains names of things which were— 


A. Before the time of Silas Marner. 

B. During the time of Silas Marner. 

C. After the time of Silas Marner. 
Before each item in the list write the letter which identifies its time 
placement. 


1. spinning wheels 6. squires 

2. gladiatorial shows 7. knights 

3. horseless carriages 8. itinerant weavers 
4. manor houses 9. steamships 

5. tenant farmers 


6 Micheels and Karnes, op. cit., p. 233.




Directions: In the lefthand column is given a list of definitions; in the 
righthand column is given a list of words taken from the text of Silas 
Marner. Before the definition in the lefthand column place the identify- 
ing letter of the word in the righthand column which it correctly defines. 


 1. Dependent upon something that may      A. admonition
    or may not occur.                      B. concession
 2. An excessive desire, especially for    C. contingent
    wealth.                                D. cupidity
 3. Quickness, skill and ease in physical  E. dexterity
    activity, especially in using the      F. husbandry
    hands.                                 G. meager
 4. Lacks richness, strength, etc.;        H. metamorphosis
    barren, scanty.                        I. query
 5. Change of form, structure or           J. repugnance
    substance, especially by witchcraft    K. subsequent
    or magic.                              L. tacit
 6. Implied or indicated, but not
    actually expressed.
 7. Inquire into; ask about.


In their analysis of the matching item, Micheels and Karnes indicate
the limitations of the matching item but offer the following
suggestions for its construction:


1. Have at least five and not more than twelve responses in each
matching exercise.
2. Include at least three extra choices from which responses must
be chosen.
3. Use only homogeneous or related materials in any one exercise.
4. Include at least three plausible choices from which the correct
response must be selected.
5. Place the column containing the longer statements on the left
side of the page.
6. Use illustrations whenever possible.
7. Arrange the selection column in logical order.
8. Keep the student in mind as the item is prepared.
9. Make sure the entire exercise appears on one page.
10. Make the directions specific.
11. Use capital letters to label the parts in the column from which
the responses are to be selected.⁷


7 Ibid., pp. 234-41.




SUMMARY 
The major suggestions for writing test items can be summarized 
as follows: 

1. Write items that test the student’s ability to apply what he has
learned. 

2. Write items in a direct and clear style at the vocabulary level of 
the students being tested. 

3. Write items that stress main points rather than trivial details. 

4. Write items in correct grammatical form. 

5. Write items appropriate to the objectives being evaluated. 

6. Write items which are technically correct. 


CHAPTER 
8 


Construction and Use of Essay 
and Short-Answer Tests 


SINCE THE EARLIEST DAYS of the objective testing movement, a
controversy has continued over the relative merits of essay and objective
examinations. Many advocates of objective methods have maintained
that the essay exam, while useful in its day, is now out of
date and should be abandoned in favor of the “more reliable, efficient 
and comprehensive” objective methods of measurement. Defenders 
of the essay technique, on the other hand, have criticized objective 
test methods as being “anti-intellectualism in the extreme.” They 
contend that merely checking responses on an answer sheet is the 
best way to teach students not to think. 

Most evaluation specialists today will agree that the essay-objective
test controversy is not an either-or proposition. Both of the
techniques, when properly used, can and should be valuable tools
for the teacher to use in evaluating the many different behaviors
which educational objectives define. For many outcomes, especially
those dealing with knowledges and understandings, objective test
methods appear to be the most efficient and best methods of
measurement. For certain purposes, objective tests are singularly
inappropriate. If, for example, a teacher wants to see how well students
can select and bring together pertinent evidence to support a plan
of action, or how they can plan a comparative evaluation of several
alternative proposals, an essay question might be more valid than
any objective test. Certainly where originality of thought, creative
writing, or any related outcomes that deal with student ability to
organize and express conflicting or original ideas about suggested
topics are concerned, there is little doubt but that the essay test is the
most appropriate technique the teacher can use.

One aspect of the basic philosophy of measurement accepted by
the authors of this book is that the kind of evaluation technique to
be employed in any particular situation will be determined by the
objectives involved. If the most valid measure can be obtained by
an objective test, then this technique should be used; if the objective
describes a behavior most validly measured by a rating scale or
checklist, then these tools should be used, even though such
techniques are admittedly less reliable and efficient than are objective
tests. The great variety of behavioral objectives defined in schools
today demands that teachers have at their command a great variety
of evaluation techniques, and that they know when and how each
may best be used. No one technique, be it the essay test, the
true-false test, pupil observation, or the sociogram, is appropriate for
all purposes.


ESSAY TESTS 


Some of the specific criticisms of essay tests and the implications 
of these criticisms follow: 


1. Essay tests are an inefficient method of measurement. It is often 
pointed out that if a teacher has thirty minutes available for test- 
ing over a certain unit or area of instruction in which there are 
perhaps twelve major topics, the use of an essay test will permit 
writing on perhaps only a few of these topics because of time 
limitations. With the same amount of time, a thirty- to fifty-item 
multiple-choice test covering several aspects of each of the twelve 
topics could be administered. Thus, the multiple-choice test is much 
more efficient than the essay test and permits a more comprehensive 
measurement. 

This criticism is valid if the content or topics involved are readily 
adaptable to objective testing. The selection of an essay test in this 
case would be an outright misuse of the technique. For most ob- 
jectives dealing with knowledge of fact or principle, with applica- 
tions of principle or interpretation of data, the most valid measure 
would be an objective test item. One of the least valid would be the
essay test, since it was simply not designed for this type of
measurement.

That this misuse of the essay test is a very common one is shown 
in several analyses of essay questions included in school testing 
programs. Up to 80 per cent of the essay questions in one study 
were designed to measure knowledges—outcomes that could be 
measured much more reliably, comprehensively, and efficiently with 
objective test items.

2. Essay tests cannot be reliably graded. Proof to support this 
criticism is often found in research findings that indicate little 
uniformity exists among raters in assigning grades to a single 
paper. One of the earliest studies, for example, showed grades rang- 
ing from 28 to 92 (100 being a perfect score) assigned a single 
geometry paper by a jury of 115 teachers. Along with reports of such 
research findings, mention is usually made of the high reliability 
coefficients found for practically all of the “common standardized 
objective tests.” 

While, as a general rule, essay tests will be statistically less re- 
liable than a good objective test, such comparisons as the above tend 
to show the essay test in a false light. If great care is used in stating 
the essay question, and if equal care is used in grading the responses 
(following a procedure such as will be described later), there is good 
reason to believe that close agreement can be reached by separate 
raters or by the same rater on different occasions. In a study con- 
ducted by one of the authors, for example, two instructors inde- 
pendently graded a set of 54 papers using a well-developed scoring 
card. The two sets of grades were correlated and an r of .83 was 
found. In another similar study, this one involving over 450 student 
papers, each paper was independently read by at least two raters 
using a complete and objectified scoring plan. Of the pairs of
grades assigned, 64 per cent were identical letter grades, and 95 
per cent were at the most deviations of one letter grade (a B and 
a C, or an F and a D assigned a single paper). On only three of the 
450 papers did the two ratings deviate by two letter grades, an A and 
a D having been assigned to each paper by these raters. 
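
The two kinds of evidence cited above, a correlation between two sets of numerical grades and the percentage of letter grades that agree exactly or within one step, can be computed as in the following sketch. It is offered only as an illustration: the grades are invented, the function names `pearson_r` and `letter_agreement` are hypothetical, and the scale A, B, C, D, F with one step between adjacent letters is an assumption consistent with the examples in the text.

```python
from math import sqrt

LETTERS = "ABCDF"  # assumed grading scale; adjacent letters are one step apart


def pearson_r(x, y):
    """Product-moment correlation between two raters' numerical grades."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_sq_x = sum((a - mean_x) ** 2 for a in x)
    sum_sq_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(sum_sq_x * sum_sq_y)


def letter_agreement(first, second):
    """Proportion of papers graded identically, and proportion within one letter grade."""
    gaps = [abs(LETTERS.index(a) - LETTERS.index(b)) for a, b in zip(first, second)]
    n = len(gaps)
    exact = sum(gap == 0 for gap in gaps) / n
    within_one = sum(gap <= 1 for gap in gaps) / n
    return exact, within_one


# Invented grades for eight papers, each read independently by two raters.
rater_one = [72, 85, 90, 64, 78, 88, 55, 93]
rater_two = [70, 88, 86, 60, 80, 91, 58, 90]
letters_one = list("CBADCBFA")
letters_two = list("CBBDBBDA")

r = pearson_r(rater_one, rater_two)
exact, within_one = letter_agreement(letters_one, letters_two)
print(f"r = {r:.2f}; identical = {exact:.0%}; within one grade = {within_one:.0%}")
```

Note that the second proportion here counts identical grades as well as one-step deviations, so it is always at least as large as the first.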

Essay tests can be graded with a reliability approaching that of
objective tests. It should be noted here that most teacher-made
objective tests do not achieve the high levels of reliability commonly
associated with standardized tests. The majority of teacher-made
tests would probably have reliability coefficients of about .80 to .85
instead of the minimum level of .90 usually demanded of a good
standardized test.

Essay tests can and should be reliably graded, but this demands 
time, skill, and patience on the part of the teacher doing the grading. 
If teachers are unwilling to put forth the necessary effort, the limi- 
tations or weaknesses reside within the individual rather than in the 
technique itself. 

One further remark concerning the relative reliability of grad- 
ing an essay and an objective test should be made. If it can be de- 
cided that the essay and the objective test are equally valid for the 
measurement of a particular objective, the objective test should 
probably be selected because of its probable high reliability and 
because it permits of a more comprehensive measurement. If, how- 
ever, the objective to be measured is of such a nature that the essay 
is a more valid measure, the superior reliability and comprehensive- 
ness of the objective test must be sacrificed and the essay employed 
in its place. The most basic characteristic of any technique of 
measurement is its validity, and all decisions relative to the use or 
nonuse of the technique must be made upon the basis of this 
factor. 

3. Essay tests encourage bluffing. Most teachers have had experi- 
ence with the student who is “short on knowledge but long on 
expression,” whose essay papers suggest at least face validity for 
the above criticism. That there are students who can “say nothing 
extremely well,” and who are rewarded with high grades for such 
accomplishment, is more a reflection on the teachers who assign the 
grades than it is a criticism of the essay technique. The essay test, if 
loosely graded, does provide an opportunity for bluffing, but it is the 
teacher who has the control over the essay and who decides how it 
shall be used and graded and who thus can control bluffing. 

As will be explained in detail later in this chapter, grading of an 
essay examination should involve a comparison of each student’s 
response against a scoring key that describes in detail those points 
or qualities that must be supplied to qualify the response as ac- 
ceptable. This key, carefully prepared in advance of the actual 
scoring and checked against a sampling of student papers to insure
comprehensiveness, lists the specific responses called for in the
question. If a paper, well written in the sense that it reads impres- 
sively or well, nevertheless does not include in it the points or re- 
sponses described in the key, there are clearly no grounds for a 
teacher assigning even a passing grade. Unless clever writing is the 
objective of the essay test, it should not be the sole criterion for the 
grade. 

4. Grading of essay tests is too often influenced by extraneous 
factors such as spelling and handwriting. This criticism, in common 
with so many of the others, is basically related to faulty practices 
in grading. Essay test critics argue, and with considerable validity 
in many cases, that grades on essays often reflect not only an evalu- 
ation of what was written but a conscious or unconscious bias re- 
sulting from errors in spelling and grammar, and from poor hand- 
writing. 

Anyone who has graded essay examinations knows how difficult it 
is to separate what is written from how it is written. A paper full 
of glaring errors in spelling and grammar, and written in a scrawl 
that at times almost defies translation, frequently is given a lower 
grade than would be given a well written and “correct” paper, even 
though the actual ideas and content of the two papers might be 
equal. This possibility of bias does exist and should be recognized. 
But while it is improbable that the average rater can remain 100 
per cent objective with regard to a very poorly written paper, a 
clear and prior determination of what is to be included in the grade
can help to eliminate most of this bias. 

The teacher must make several decisions based upon the ob- 
jectives that have been defined for the course or unit. The first and 
most basic decision is whether or not any conscious attention should 
be given to mechanics and/or effectiveness of expression. Some 
teachers in subject areas other than English feel that grading of 
mechanics and effectiveness of expression is the sole responsibility of 
the English teacher and that the teacher of physics, for example, 
should grade solely upon competence within the specialized subject 
matter. If the paper in question is to be graded on English usage, 
they say, it should be done by the English teacher and this grade 
should be kept entirely separate from any evaluation of subject 
matter competence. 




Other teachers have broader objectives of instruction, it being 
entirely in order for a physics teacher to feel that good handwriting 
and correctness of expression are as fundamental as is knowledge of 
subject matter. Many teachers thus view student achievement as a 
totality, and feel that intensive knowledge in a departmentalized 
subject area is not enough. They would argue that the student must 
be able to use his knowledge and communicate it to others, and that 
this is impossible if he neglects the fundamentals or even the 
“niceties” of written expression. In this concept, writing competence 
is a valid outcome of teaching physics and any paper submitted to 
the physics teacher should be judged not only on what was written 
but how it was written. According to this philosophy, in which the
teacher is more than a dispenser of subject matter, every teacher ac- 
cepts responsibility for helping students to develop and maintain 
writing skills, and also accepts responsibility for grading essay papers 
for mechanics of expression. 

The question of how much conscious attention should be paid to 
writing skills by the classroom teacher can thus be decided in terms 
of what the teacher accepts or defines as instructional objectives. 

The second decision to be made, if one accepts the view that “every 
teacher is an English teacher,” has to do with the question of how 
the subject matter essay should be graded. The two main alternatives 
are to assign separate grades for content and for presentation, or to 
assign a single weighted grade. The authors believe that the former 
alternative is more desirable, and that as many separate evaluations 
should be made as there are major objectives to be evaluated. This 
would mean, for example, that a single essay question might have 
two, three, or even more separate grades assigned to it, the grades 
representing evaluations of subject matter mastery, grammatical 
correctness, spelling, and correct word usage. 

It should be understood that adequate criteria would be defined 
for each evaluation to be made to insure that the grades assigned 
would be as valid and reliable as possible. 

5. The usability or diagnostic value of essays is nil. The last major 
criticism of the essay examination is its supposedly limited value 
as a teaching device. It has been said that the single grade assigned 
an essay, or the several grades assigned, if multiple objectives are 
evaluated, is not specific or diagnostic enough to satisfy the
criterion of usability. This criticism is justified if the specific
strengths or faults in an essay examination are not made known to
the students. This kind of interpretation does take time and effort 
on the part of the teacher. It is more complex and time consuming 
than is the simple process of counting correct responses and report- 
ing that Johnny missed 30 per cent of his spelling words on a true- 
false test or 5 per cent of the problems on an algebra test. 

If an essay test is given, the student should know why he re- 
ceived the grade he did. If he only mentioned two out of the ten 
points that the teacher had identified as essential for a correct 
answer, if his spelling of certain words was incorrect, or if his at- 
tempts to organize his ideas were poor, separate and complete com- 
ments defining these deficiencies should be noted on his paper when 
it is returned. In this way, the student can learn why he received the 
grade he did and can be helped to correct the deficiencies the teacher 
noted.

The diagnostic value of an essay test is limited only by the amount 
of time and effort the teacher wants to spend in pointing out the 
nature and location of errors. 

Other criticisms of the essay test deal more with the misuses of 
the technique than with its basic characteristics. It must be recog- 
nized that the essay item does have faults. It is generally less re- 
liable than comparable objective type tests; it takes a great deal 
of the student's time to write the answer; it demands much more of 
the teacher's time to grade adequately; it may lack some validity in 
a comprehensive field of study because of the limited sampling of
objectives that can be covered in an examination period; and, despite
the most careful precautions a teacher can take, the likelihood of 
personal bias affecting the grading is great. The essay is clearly not 
the perfect technique, but, for those objectives that call for this kind 
of measurement, it is the best and most valid measure available and 
should thus be included in the array of evaluation techniques em- 
ployed by the good teacher. 


When to use the essay test 


In setting forth general rules for the use of the essay, one “nega- 
tive” rule and one “limiting” rule must first be considered. The 
negative rule suggests that the essay should not be used in any specific
case if a more objective measure is equally applicable. In other
words, if student progress or status with respect to a particular objective
can be measured with equal validity by either an essay or an
objective test, the latter should be used. This, of course, is because
of the admittedly greater reliability and efficiency that can usually 
be expected through the use of objective measurement. 

The limiting rule is that the decision to use an essay examination 
obligates the teacher to use it properly. The value of any essay test is 
limited mainly by the degree to which teachers are willing to take 
the time and effort to use the technique properly. 

Essay questions have been found to be most useful with respect 
to: (1) objectives stressing students’ ability to draw upon and to 
organize, integrate, and/or evaluate their store of knowledge and ex- 
perience; (2) objectives dealing with creative writing or originality 
of expression; (3) objectives specifying actual writing competence, 
such as one might find in English or journalism classes; and (4) ob- 
jectives having to do with application and interpretation of facts and 
principles. 

For each of these classifications of objectives, considerably more 
is demanded from the student than can ordinarily be measured 
very adequately by objective means. In each case, a direct form of 
measurement is intended, such as having a student in a Business 
English class write a letter of application. 


Constructing and grading essay examinations 


Despite the fact that many teachers believe essay examinations 
are easier to use than objective tests, the preparation and scoring 
of good essay examinations is a fairly complex operation. If an 
instructor decides to use an essay examination, he should give care- 
ful attention to the writing of such questions and to procedures 
which will insure reliable and objective grading. 


Suggestions for constructing essay examinations 

1. Draft the essay question carefully, defining important direc- 
tional words to eliminate semantic difficulties. For example, you 
may want to say: “By ‘discuss,’ I mean to give reasonable facts and 
state your interpretations; by ‘compare’ I mean give a full answer
in which you consider like and unlike factors in the situations;
‘contrast’ calls for a consideration of the ways in which the factors
differ,” and so on. Each key word should be defined, for each calls 
for a different approach in the writing of such an examination. 

2. Phrase the questions so as to give hints concerning the struc- 
ture of the answer expected, unless this is inconsistent with the ob- 
jective to be measured. Thus the question, “Discuss the Articles of 
Confederation,” becomes more tangible both for student and grader 
if phrased: “Discuss the Articles of Confederation with respect to 
their origin, their working out in practice and their relationship to 
the present federal Constitution.” Students are less likely to ramble 
and bases for comparison in grading will be more uniform with the 
common set of reference points. This method is superior to a plain 
“discuss” question because it identifies for all the elements which 
are to be treated. It assures that no pupil will neglect important 
aspects due to mere oversight. It also simplifies scoring by assuring 
that the pupil’s responses will be found sufficiently isolated to permit
a separate reckoning of each.
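
The separate reckoning made possible by a structured question can be pictured as a small scoring key. The sketch below is an assumption-laden illustration, not the authors' own scoring plan: the class `KeyPoint`, the function `score_response`, and the weights of 3, 4, and 3 are all hypothetical, and the decision as to which points a paper actually covers is still made by the teacher reading it.

```python
from dataclasses import dataclass


@dataclass
class KeyPoint:
    label: str
    weight: int  # hypothetical credit allotted to this point


def score_response(key, points_covered):
    """Reckon each point separately; return credit earned, credit possible, and points missed."""
    earned = sum(point.weight for point in key if point.label in points_covered)
    possible = sum(point.weight for point in key)
    missing = [point.label for point in key if point.label not in points_covered]
    return earned, possible, missing


# Key built from the three reference points named in the rephrased question.
key = [
    KeyPoint("origin", 3),
    KeyPoint("working out in practice", 4),
    KeyPoint("relationship to the present federal Constitution", 3),
]

# The reader judges that this paper treated two of the three points.
covered = {"origin", "relationship to the present federal Constitution"}

earned, possible, missing = score_response(key, covered)
print(f"{earned} of {possible} points; not covered: {missing}")
```

The list of points not covered doubles as the notation returned to the student, so the same key serves both grading and diagnosis.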

3. Write questions that can readily be answered within the time 
allowed for the examination. Plan the steps a student would ordi- 
narily go through in answering the question and try to estimate 
carefully how much total time is needed to do an acceptable piece 
of work. Allow enough time for students to outline, write, and read 
through their answers. Nothing is more frustrating to students than 
to be allowed fifteen to twenty minutes to answer a question that 
easily demands thirty-five to forty-five minutes of thought and writ- 
ing to be adequately covered. Suggest a time allowance for each 
question if more than one essay question is included in the examina- 
tion. 

4. Construct the question or questions so that they are of such a 
range of difficulty as to allow all students to demonstrate their level 
of competence. If a single question is used for the examination, its 
level of difficulty should be such that students with only limited 
competence may demonstrate what they can do. Grades for students 
who show higher levels of competence should then be assigned with 
regard to the degree to which they go beyond the minimal level of 
response expected. If several questions are included in the examina- 
tion, the questions should vary in difficulty, ranging from one topic 
or problem which all students should be able to handle successfully
through a topic or problem which only the more competent students
would be able to deal with adequately.

5. Require all students to take the same examination, avoiding
choices among several questions. First, if the assigned topic or problem
represents an important objective of the course, a necessary
characteristic of any question used, the progress of all students toward
this goal should be measured. Second, allowing a choice of questions
introduces very complex problems, such as whether the various
options are of equal difficulty, whether acceptable answers actually show
equal attainment of the objective being tested, whether the time
required for each choice is equal, and so on. Unless optional
questions can be thus equated, and experience shows that this can seldom
be done, it is wiser to require all students to answer the same
questions.

With proper care used in the preparation of essay questions, valid 
information concerning the extent to which students attain the ob- 
jectives of instruction may be obtained. However, the worth of such 
information for purposes of grading student progress is nil unless 
such exams are reliably scored.

Educational journals contain many references to studies showing 
the wide variation in grades assigned to a single essay answer by 
groups of teachers or by single teachers scoring the exam at differ- 
ent times. Assignment of grades ranging from A through D or even 
F to a single essay are not uncommon findings. However, other re- 
search findings show that very reliable grades can be assigned to 
essay questions if a plan of objectifying grading procedures is 
worked out and carefully followed. Such objectified procedures can
not only increase the reliability of grades assigned, but can also
facilitate the diagnostic use of such examinations.


Suggestions for grading essay examinations 

1. When grading papers, try to minimize the personal element by 
keeping student responses as anonymous as possible until grades 
have been decided upon. If students write on one side of the page 
only, have them sign their examinations on the reverse side of their 
last page. If responses are written in “blue books” and are signed 
on the front cover, go through and fold back the covers on all the 
booklets before beginning to grade any of the papers. It is then up
to the instructor to resist the temptation to peek at the signature
before or during grading. 

2. If the examination includes more than one question, grade just 
one question at a time. Grading essay questions against objective
criteria can be more effectively accomplished if the same question 
on each paper is graded before going on to the next question. Com- 
parisons of student answers to any particular question and grading 
against specific points to be covered in an acceptable answer tend 
to become confused and unreliable if the instructor attempts to grade 
more than one question at a time. 

3. Decide prior to grading what score should be allowed for
each question and what specific points should be covered in an acceptable
answer. Write out a table of specifications listing the specific
points that should be covered in answering the question and
the amount each specific should count toward the grade. Grades 
for each question can then be assigned on the basis of the points 
covered in the answer. (To aid students to understand where their 
papers are inadequate, notations should be made on the answer
sheet indicating specifics either not covered or inadequately treated.) 
Such a set of specifications assumes that if more than one question 
is included in the examination, the several questions will be weighted 
in accordance with the importance and difficulty of the question. 
Before using the set of specifications listing individual points to be 
covered, test it out by reading through a sample of answers and then 
add to, or modify, the listing to insure that the first paper will be 
graded against the same criteria as the last one. 
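
The table of specifications described in suggestion 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three points and their weights are invented for the example, and the names `SPECIFICATIONS` and `grade_answer` are hypothetical, not part of any procedure given in the text.

```python
# Hypothetical table of specifications for ONE essay question.
# The points and their weights are invented for illustration only.
SPECIFICATIONS = {
    "states the central cause": 4,
    "gives two supporting examples": 4,
    "explains the outcome": 2,
}


def grade_answer(points_covered, specifications=SPECIFICATIONS):
    """Return (score, missing_points) for one answer to one question."""
    unknown = set(points_covered) - set(specifications)
    if unknown:
        raise ValueError(f"points not in the specifications: {sorted(unknown)}")
    # Each covered point counts once, however often it is mentioned.
    score = sum(specifications[point] for point in set(points_covered))
    # The missing points are the notations to write on the student's paper.
    missing = [point for point in specifications if point not in points_covered]
    return score, missing


score, missing = grade_answer(["states the central cause", "explains the outcome"])
print(score, "of", sum(SPECIFICATIONS.values()))
print("Not covered:", missing)
```

Keeping the list of uncovered points alongside the score mirrors the suggestion that notations on the paper show the student where the answer was inadequate.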

4. Make a special separate provision for consideration of sentence 
structure, spelling, and so on. These factors should not be permitted
to affect the score on an essay which is primarily concerned with 
the instructional objectives, unless the mechanics of expression are 
such that the intent of the answer cannot be determined. In such 
a case, a grade may well be withheld until the student’s command of 
the mechanics of expression is such that his answer may be clearly 
understood. Conscious effort is required to avoid a favorable preju- 
dice toward a student's response through such considerations as 
neat handwriting, extensive general vocabulary and fine prose, and 
to avoid an unfavorable prejudice toward students’ responses
which are characterized by faulty mechanics. Strictly speaking, in
courses in which such communication skills are not major objectives,
a separate grade should be assigned in addition to those
dealing with achievement toward major goals, which will indicate 
effective or ineffective use of the mechanics of expression. 

5. Test the reliability of your own grading procedures by regrad- 
ing a group of papers some time after the original grading. Remove 
the grades originally assigned or hide them carefully from view. 
Then, using the same set of specifications as in the original grading, 
regrade the papers after two or three weeks have passed. If fewer than
90 to 95 per cent of the grades assigned on these two independent
trials agree, your grading procedures need to be revised to
insure a higher degree of reliability.
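
The regrading check in suggestion 5 amounts to a simple percentage of agreement. The sketch below assumes letter grades and uses the lower bound of the 90 to 95 per cent range as the cutoff; the function name `percent_agreement` and the sample grades are invented for illustration.

```python
def percent_agreement(first_grades, second_grades):
    """Percentage of papers given the same grade on two independent trials."""
    if not first_grades or len(first_grades) != len(second_grades):
        raise ValueError("both trials must grade the same, non-empty set of papers")
    matches = sum(a == b for a, b in zip(first_grades, second_grades))
    return 100.0 * matches / len(first_grades)


# Invented grades for ten papers: original grading and regrading weeks later.
original = ["A", "B", "B", "C", "A", "D", "C", "B", "B", "A"]
regraded = ["A", "B", "C", "C", "A", "D", "C", "B", "B", "A"]

agreement = percent_agreement(original, regraded)
# Assumption: 90 per cent is taken as the minimum acceptable agreement.
needs_revision = agreement < 90.0
print(f"{agreement:.0f} per cent agreement; revise procedures: {needs_revision}")
```

The same function could be applied to the grades of two instructors in double grading, since it only compares two lists of grades for the same papers.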

6. Use double grading (two instructors grading the same papers
independently) wherever possible, agreeing on grading standards and
specifications in advance. The practice of having two instructors
grade essay questions independently is especially advisable for examinations
which are heavily weighted in the assignment of final
grades for students. In the case of disagreements, a conference should
help lead to an understanding of the differences, and mutually
accepted grades can then be determined. With any considerable number of
disagreements (more than five to ten per cent of the total), the grading
specifications should be examined to determine wherein they are
inadequate, and the papers then rescored using the revised grading
specifications.


SHORT-ANSWER ITEMS 


The short-answer item is a free-response form. The test taker is 
called upon to write an answer to a question rather than to circle 
a number or letter as is required in objective items. The short- 
answer item, however, is much more restrictive than the essay, for 
the test taker is forced to answer the question within very prescribed 
limits. Several examples of the various short-answer forms will 
illustrate the uses commonly made of this kind of item. 


(1) ________    Rickets are caused by vitamin (1) deficiency.

(2) ________    Silas Marner had lived in Raveloe (2) years
                when the story opened.

(3) ________    Books are listed three ways in a card catalogue.
(4) ________    These are by (3), by (4), and by (5).
(5) ________

(6) ________    How many senators represent each state in Congress? (6)

(7) ________    The factors of x² + 5x + 6 are (7).

(8) ________    What organization is the N.A.M.? (8)

(9) ________    Write the type of figure each of the following drawings
(10) _______    represents: (9), (10), (11)
(11) _______    [Drawings not reproduced.]

(12) _______    What are three primary colors? (12), (13), (14)
(13) _______
(14) _______


There are many variations of the short-answer item, but basically
most of them call for the recall of specific information that can be
placed in a blank or a series of blanks. When the student is expected
to give an extended answer to a short-answer item, it would probably
be better to reconstruct it as an essay item. While the short-answer
item is considered of somewhat more limited use than the multiple-choice
type of item for measuring the recognition and recall of
factual information, it does have uses in specialized areas where the
recall of specific bits of information is considered essential.

Kenneth L. Bean indicates the following advantages and disad- 
vantages of the completion, or short-answer, test item: 


Among advantages may be mentioned that recall is demanded instead 
of mere recognition. Study habits must be thorough, therefore, in order 
for recall to take place. Guessing, which is likely on true-false and pos-
sible on multiple-choice material, is reduced to the very minimum. On
the other hand, there are limitations that deserve mention. Unless they 
are very carefully constructed, completion items are likely to require no 
mental processes higher than rote memory. They measure detailed 
factual information largely, not thinking or organizing ability. There are
some rather subtle difficulties in construction, which, if not observed and
resolved, will lead to lack of objectivity in scoring. Grading is not nearly 
so rapid as it is for other types of objective questions, and may often 
call for expert judgment rather than clerical help, since acceptable an- 
swers not included by the writer in the scoring key are apt to occur. 
Objectivity may often be much less than in other types of so-called “objective”
test items. In this respect, completion material falls somewhere
between the completely objective and the essay forms. . . . The length 
of answer, variety of possible answers, and factual or theoretical nature 
of the subject matter will all enter into the relative objectivity or sub- 
jectivity of scoring. Most statements with an omission or two can be 
changed around into the form of a direct question calling for a brief
answer. . . . Many standardized tests of achievement and aptitude 
have employed completion forms of one sort or another with successful 
results.¹


Robert Ebel, writing in Educational Measurement, offers the fol- 
lowing suggestions to individuals using the short-answer form of test 
item: 


1. Use the short-answer form only for questions that can be answered
by a unique word, phrase, or number. 

. . . The implication of this restriction is that a test composed exclu- 
sively of short-answer items is almost certain to overemphasize vocabu- 
lary. It is probably safe to observe that written tests in general, both 
essay and objective, have always placed too great a premium on vocabu- 
lary and too little upon other important aspects of achievement. 

2. Do not borrow statements verbatim from context and attempt to 
use them as short-answer items. 

Ambiguity of the item and perplexing variations in the answers are 
almost certain to result from this procedure. 

3. Make the question, or the directions, explicit. 

Avoid such indefinite questions as 


a. Who was George Washington? 
b. Where did Columbus land? 


In computational problems, specify the degree of precision expected 
and indicate whether or not units of measurement must accompany nu- 
merical answers. 


¹ Kenneth L. Bean, Construction of Educational and Personnel Tests (New York:
McGraw-Hill Book Co., Inc., 1953), pp. 75-76.

4. Allow sufficient space for pupil answers, and arrange the spaces
for convenience in scoring. 

It is frequently convenient to have all the blanks in a single column 
at either the left or the right margin of the examination paper.

5. In computational problems specify the degree of precision expected, 
or, better still, arrange the problems to come out even except where 
ability to handle fractions and decimals is one of the points being tested. 

If the correctness of a numerical response depends upon stating the
unit of measurement, make this fact clear. If not, it is best to include 
the unit of measurement in the statement of the question as, for example, 


c. The volume of a cube nine feet on an edge is ______ cubic feet. 


6. Avoid overmutilation of completion exercises.
An extreme example of this may be observed in the following sample 
item. 


A hay ______ affords another ______ of the ______ existing
between ______ and ______.


A student with a good memory who had encountered this statement 
before, might be able to puzzle it out and successfully fill the blanks 
with the words “infusion,” “illustration,” “relationship,” “animals,” and 
“plants.” But it is obvious that far too many words have been removed 
to permit the item to pose a clear-cut problem. Even an expert biologist 
would find the item troublesome, and he would certainly brand it as 
trivial (unless he wrote it himself).²


SUMMARY 


The teacher, confronted with the necessity of using and constructing
various forms of test items, needs to consider carefully the
uses, advantages, and limitations of each of the forms. The fundamental
consideration should be: “How well will this particular item
measure the progress the student is making toward the goals of instruction?”
If any group of items brought together in a test actually
does assist the teacher to improve the teaching-learning situation,
then whatever technical disadvantages exist can be minimized although
not ignored. It is a professional responsibility of the teacher
to learn how to use effectively and efficiently the tools of the profession.
Tests represent one of these tools, and it is absolutely necessary
for teachers to learn how to construct them, administer them, score
them, and apply the results wisely.


² Robert L. Ebel, “Writing the Test Item,” Educational Measurement (Washington,
D.C.: American Council on Education, 1951), pp. 227-28.


CHAPTER 
9 


Checklists, Rating Scales, Inventories, 
and Questionnaires 


IT IS FUNDAMENTAL in the philosophy of evaluation that all, or at
least as many as possible, of the characteristics of students be ap- 
praised. It is only through an accurate and complete picture of youth 
that the school can hope to teach him effectively. Tests and the results
of tests provide only a part of the total picture, that part
chiefly concerned with what the student knows. Thus, there remain 
several important voids in the teacher’s knowledge of the student— 
his personal and social adaptability (behavior); his performance 
in skill areas; the quality of product which he produces; and mis- 
cellaneous attributes and characteristics, such as his ideals, interests, 
attitudes, work habits, health, and ways of thinking. These voids 
must and can be filled with information obtained largely through 
the use of checklists, rating scales, inventories, questionnaires, and 
other informal devices. 


CHECKLISTS 


A checklist may take one of several forms and be used for several 
different purposes. It may be— 


1. A list of steps in performing a certain operation used by an observer
in evaluating a student's proficiency in some skill.

2. A list of activities or characteristics that are checked off if they 
exist and are left blank if they do not. 

3. A list of goals or objectives that are given check marks as they are 
achieved.

4. A list of topics or assignments that are checked off as they are 
covered by a teacher or student in a unit or segment of a course. 


One of the simplest forms of checklist is the grocery or shopping 
list. It is obvious that in the checklist there are no halfway re- 
sponses. In the case of the grocery list, the item either is purchased 
or it isn’t. In its various forms, the checklist is chiefly useful for 
“yes-no” or “either-or” situations. Its major function is to focus 
attention on the items covered rather than on the relative importance 
of the individual items. 

A checklist, Figure 6, is most often used in recording the pro- 
cedures used by a student in the performance of some complex task 
or skill. It is useful in laboratory sciences, industrial arts, home 
economics, athletics, forensics, dramatics, music, and other perform- 
ance areas in which certain tasks are most effectively performed by 
the use of specific sequences and techniques taught to the student. 
In constructing such a device, a list of actions, both appropriate and 
inappropriate, is prepared in advance. The teacher observes each 
pupil separately as he attempts to perform the assigned task and 
keeps a running account (by checking the appropriate items on the 
checklist) of his procedures, both good and bad, in order of oc- 
currence. The student’s performance is then evaluated in terms of 
how closely it compares with the procedures or actions considered 
best for reaching the desired goal. 
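
The comparison of an observed performance with the preferred procedure can be sketched as follows. The laboratory steps, the list of inappropriate actions, and the function name `summarize_performance` are all hypothetical and serve only to show the kind of record a performance checklist yields.

```python
# Hypothetical job analysis for a simple laboratory task (illustrative only).
BEST_PROCEDURE = [
    "puts on goggles",
    "lights burner",
    "adjusts flame",
    "heats test tube",
]
INAPPROPRIATE = {"points test tube at neighbor", "leaves burner unattended"}


def summarize_performance(observed):
    """Compare the steps checked, in order of occurrence, with the best procedure."""
    performed = [step for step in observed if step in BEST_PROCEDURE]
    expected_order = [step for step in BEST_PROCEDURE if step in observed]
    return {
        "omitted": [step for step in BEST_PROCEDURE if step not in observed],
        "in_sequence": performed == expected_order,
        "errors": [step for step in observed if step in INAPPROPRIATE],
    }


# Running account kept by the teacher while watching one pupil.
observed = [
    "lights burner",
    "puts on goggles",
    "adjusts flame",
    "points test tube at neighbor",
    "heats test tube",
]
summary = summarize_performance(observed)
print(summary)
```

The summary separates three questions the teacher is likely to ask: which steps were left out, whether the steps came in the taught order, and which undesirable actions occurred.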

In the construction of the checklist it is essential that the indi- 
vidual items of the list be defined or stated in specific and, if pos- 
sible, functional terms rather than in general terms, for example: 


Puts undue amount of “spin” on ball when shooting 
Unable to work without specific directions 

Usually walks home alone 

Stammers when appearing before an audience 

Voice is well modulated and melodious 

Holds clarinet at correct angle to body 


Although each of the above items is taken from a different type 
of checklist, all of them are functionally defined and can be checked 
as either existing or not existing. 

In the construction of a checklist to evaluate performance, the first
step always is a careful “job” analysis of the skill or task to be
observed. It is important that each essential component of the skill
be included in the list and, if necessary, that they be in proper order
or sequence.


PHYSICAL EDUCATION CHECKLIST

Name ____________________
Date ____________________

1. SPORTSMANSHIP
   ____ a. Is a good winner
   ____ b. Is a good loser
   ____ c. Accepts decisions without complaint
   ____ d. Plays his best at all times
   ____ e. Is a good team player

2. GROUP RELATIONSHIP
   ____ a. Advances ideas to which the group pays attention
   ____ b. Is willing to accept suggestions
   ____ c. Accepts leadership of others
   ____ d. Is attentive during class discussions
   ____ e. Follows instructions
   ____ f. Follows group decisions
   ____ g. Waits his turn to talk with others

3. SELF-DIRECTION
   ____ a. Is prompt in getting to class
   ____ b. Is prompt in departing for next class
   ____ c. Can be depended upon to complete a job

4. ENTHUSIASM
   ____ a. Shows interest in every class
   ____ b. Requests help in improving skill
   ____ c. Offers suggestions for class activities

5. PERSONAL HYGIENE AND SAFETY
   ____ a. Wears clean gym clothing
   ____ b. Wears regulation gym clothing
   ____ c. Takes showers
   ____ d. Wears shower clogs
   ____ e. Observes personal safety measures
   ____ f. Conducts himself so as not to endanger others

6. CARE OF EQUIPMENT
   ____ a. Uses equipment properly
   ____ b. Returns equipment when finished with it
   ____ c. Has identification on all gym clothing
   ____ d. Puts personal equipment into locker

Fig. 6. A typical checklist

Checklists can readily be made up by the classroom teacher for 
many purposes. The possibilities are almost endless, but a few 
illustrations might suggest the wide variety of uses: 


1. Health habits list.

2. List of books to be read.

3. Evaluation of books, pamphlets, tests, periodicals, maps, and
globes in terms of specific criteria.

4. Teachers’ self-evaluation checklist.

5. Student’s self-analysis checklist to help him know himself from
the standpoint of his strong points and his weak points.

6. Behavior problem checklists to help in solving school discipline
problems.

7. “Personal error charts” in which the student checks his errors in
such things as spelling, capitalization, diction, pronunciation, sentence
structure, and punctuation.

8. Chart similar to Figure 7 kept by the teacher for each student.

9. “Practice” checklist on which the student places a check mark
for each day he practices. This might be done for any activity,
such as music lessons, reading, or studying, which should be practiced
or engaged in every day. Although the check mark does not
indicate the extensiveness of the practice, it does provide somewhat
of an incentive to develop regular practice habits.

10. “Topics” checklist prepared by the teacher listing some fifty to
one hundred topics related to a course which have a bearing on
the student and his future development. Students check the ten
or fifteen topics of most interest to them. The items checked by
the greatest number of students then serve as a guide to the
teacher in developing course units.

11. In the area of music, a popularity checklist has proved useful to
determine the popularity of certain types of music or specific
musical selections with members of musical organizations within
the school as well as with audiences. The results of such checklists
might serve as a basis for developing future programs.
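
The tallying implied by the “topics” checklist can be sketched briefly. The topics and the three students' selections below are invented, and `most_checked_topics` is a hypothetical helper; the sketch simply counts how many students checked each topic.

```python
from collections import Counter


def most_checked_topics(student_checklists, top_n=2):
    """Return the topics checked by the greatest number of students."""
    tally = Counter()
    for checked in student_checklists:
        tally.update(set(checked))  # count each student at most once per topic
    return tally.most_common(top_n)


# Invented selections from three students (a real list would hold 50 to 100 topics).
checklists = [
    ["choosing a career", "budgeting money"],
    ["choosing a career", "dating"],
    ["choosing a career", "budgeting money", "study habits"],
]
top_topics = most_checked_topics(checklists)
print(top_topics)
```

The topics at the head of the tally are the ones a teacher might consider first when planning course units.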


The Mooney Problem Check List is a commonly used commer- 
cially prepared checklist of problems common to adolescents of 
either junior or senior high school age. It is an instrument to enable
the teacher or counselor to identify quickly the problems or problem
areas which concern the student. 

The Mooney Check List presents the student with a series of prob- 
lems and asks him to underline those that trouble him. Sample items 
follow: 


6. Not enough suitable clothes to wear 
13. Lacking a place to entertain friends 
14. Wanting to learn how to entertain 
27. Not mixing well with the opposite sex 
35. Parents sacrificing too much for me 
36. Belonging to a minority religious group 
44. Don't know how to study effectively 
59. Threatened with a serious ailment 
63. Needing money for education beyond college 
69. Slow in getting acquainted with people 
80. Sometimes wishing I'd never been born 
90. Feeling I don't really have a home 
92. Too little chance to develop my own religion 


Mooney lists the following uses for his checklist in the manual: 


I. To facilitate counseling interviews
   1. To prepare students for an interview by giving them an opportunity
      to review and summarize their own problems and to see
      the full range of personal matters they might discuss with their
      counselor or teachers.
   2. To save time for the interviewer by providing him with a quick
      review of the variety of problems which are the expressed concern
      of the student.

II. To make group surveys leading to plans for individualized action
   1. To find out what problems young people are concerned with in
      their personal lives.
   2. To help locate students who want and need counseling or other
      personal help with problems relating to health, school, home, social
      relationships, personality, or other personal problems.
   3. To help locate the most prevalent problems expressed within a
      student body as a basis for new developments and revisions in
      the curricular, extracurricular, and guidance programs of a
      school.

III. As a basis for homeroom, group guidance and orientation programs
   1. To stimulate each student to quicker recognition and analysis of
      his needs.
   2. To indicate discussion topics and group activities which are related
      to the personal interests and needs of the students in any
      given group.

IV. To increase teacher understanding in regular classroom teaching
   1. To suggest approaches by which a teacher can establish a more
      personalized relationship with each of his students.
   2. To enable special analysis of students who are hard to “reach”
      or understand.

V. To conduct research on the problems of youth
   1. To show changes and differences in problems in relation to age,
      sex, social background, school ability, interest patterns, and the
      like.
   2. To discover clusters of associated problems.
   3. To measure changes brought about by a planned problem-reduction
      program.¹


When the student is through checking the items, the summarizing 
process results in a count of checks made in the following problem 
areas. 


College and High School Forms
330 items, 30 in each area
I. Health and Physical Development (HPD)
II. Finances, Living Conditions, and Employment (FLE)
III. Social and Recreational Activities (SRA)
IV. Social-Psychological Relations (SPR)
V. Personal-Psychological Relations (PPR)
VI. Courtship, Sex, and Marriage (CSM)
VII. Home and Family (HF)
VIII. Morals and Religion (MR)
IX. Adjustment to College (School) Work (ACW) (ASW)
X. The Future: Vocational and Educational (FVE)
XI. Curriculum and Teaching Procedure (CTP)
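
The summarizing step, a count of checks in each problem area, can be illustrated with a short sketch. The assignment of item numbers to areas below is invented for the example and does not reproduce the published check list; only three of the area abbreviations are used, and `count_by_area` is a hypothetical name.

```python
from collections import Counter

# Hypothetical item-to-area key; the real key belongs to the published instrument.
ITEM_AREA = {1: "HPD", 2: "FLE", 3: "SRA", 4: "HPD", 5: "FLE", 6: "SRA"}


def count_by_area(underlined_items, item_area=ITEM_AREA):
    """Count the items a student marked in each problem area."""
    counts = Counter({area: 0 for area in item_area.values()})
    for item in underlined_items:
        counts[item_area[item]] += 1
    return dict(counts)


# Invented record: the student underlined items 1, 4, and 5.
area_counts = count_by_area([1, 4, 5])
print(area_counts)
```

Areas with no checks are kept in the result with a count of zero, so that the profile for every student covers the same set of areas.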


In the checklist illustrated in Figure 7, the emphasis is placed 
upon student self-evaluation of specific study skills. The value of 
self-evaluation procedures cannot be overestimated for they provide
a base for future corrective education. 


¹ Ross L. Mooney and Leonard V. Gordon, “Manual (1950 Revisions) The
Mooney Problem Check Lists” (New York: The Psychological Corporation, 1950),
pp. 3-4.


Name ____________________          Date ____________________

This checklist was compiled to help each student discover his own
weaknesses and strengths in certain study skill habits. How do you rate?

YES   NO
___   ___    1. Can I answer a specific question briefly and to the
                point?
___   ___    2. Can I talk and write without playing on words?
___   ___    3. Can I make notes from a report?
___   ___    4. Can I use an outline in a problem assignment?
___   ___    5. Can I “spot” quickly the information I need?
___   ___    6. Can I gain the general sense of an entire article by
                “spotting” key ideas?
___   ___    7. Do I focus my mind upon the one thing that I want
                to know?
___   ___    8. Do I read groups of words?
___   ___    9. Do I pronounce words silently?
___   ___   10. Do I linger over details and examples?
___   ___   11. Do I consult the title page, table of contents, list of
                illustrations, and preface when I inspect a book?
___   ___   12. Do I use the table of contents and index when locating
                material?
___   ___   13. Do I know how to use an atlas?
___   ___   14. Do I know how to use the Reader's Guide to Periodical
                Literature?
___   ___   15. Do I know how to use an encyclopedia?
___   ___   16. Do I read for the purpose of securing information?
___   ___   17. Do I read for the purpose of understanding a situation?
___   ___   18. Do I read for the purpose of forming an opinion?
___   ___   19. Do I read for the purpose of verifying facts and
                opinions?
___   ___   20. Do I read for the purpose of obtaining directions and
                acting upon directions?
___   ___   21. Do I read for the purpose of forming a basis of judgment?
___   ___   22. Do I read for the purpose of evaluating material?
___   ___   23. Do I note the correct spelling in syllables of the particular
                word that I am spelling?
___   ___   24. Do I note correct pronunciation?

Fig. 7. Self-rating checklist


Any problem checklist must meet three requirements if it is to re- 
turn valid results: 


1. Students must be able to recognize their own problems.
2. They must find these problems listed. 
3. They must be willing to record these problems. 


RATING SCALES 


A rating scale is a device used to evaluate situations or character- 
istics that can occur or be present in varying degrees, rather than 
merely present or absent as is the case with the checklist. A rating 
scale is an instrument so designed as to facilitate appraisal of a 
number of traits or characteristics by reference to a common quanti- 
tative scale of values. 

Three types of rating scales in common usage are descriptive 
scales, numerical scales, and graphic scales. In each type provision
is made for the rater to place each person or thing rated somewhere 
along the range or continuum of each trait selected for rating. 

Descriptive scales provide for each trait a list of descriptive 
phrases, usually from one to seven in number, from which the rater 
selects the one most applicable to the person or thing being rated 
and records his selection usually by means of a check mark. This 
type is illustrated by the Behavior Description devised by the Re- 
ports and Records Committee of the Eight-Year Study of the Pro- 
gressive Education Association. This form describes the character- 
istic behavior of the student in seven important areas: Responsibility-Dependability;
Creativeness and Imagination; Influence; Inquiring
Mind; Openmindedness; Social Concern; and Emotional Responsiveness.

One area of this form is described as follows:


Influence:

____ Controlling:    His influence habitually shapes the opinion, activities,
                     or ideals of his associates.

____ Contributing    His influence, while not controlling, strongly
     Influence:      affects the opinions, activities or ideals of his
                     associates.

____ Varying:        His influence varies, having force when particular
                     ability, skill, experience, or circumstance
                     gives it opportunity or value.

____ Cooperating:    Has no very definite influence on his associates,
                     but contributes to group thinking and action because
                     of some discrimination in regard to ideas
                     and leaders.

____ Passive:        Has no definite influence on his associates, being
                     carried along by the nearest or strongest influence.²


Numerical rating scales are set up so that the rater assigns a code 
number to each trait of the person being rated. Code numbers are 
assigned to the descriptive phrases, arranged in order of the degree, 
level, intensity, or frequency with which they indicate possession, 
lack, or occurrence of each trait. For example, number 1 (or 0) may
be synonymous with “never”; 2, “very seldom”; 3, “occasionally”;
4, “much of the time”; and 5, “constantly” or “always.” In making
the rating, the rater places the appropriate number beside each 
trait being rated. 

For example, a teacher may wish to rate all his students on the 
single trait, industriousness. He would, therefore, list all observable
behaviors by which this trait might be recognized as follows: 


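[Table: the students’ names (Joe M., Sally R., Bill J., Tom S., Ann B.)
set against the observable behaviors, with space for a numerical rating
in each cell. The behavior headings and entries are not recoverable.]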


Then, using the numerical scale described above, he would assign 
a number to each behavior for each student in the class. 
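
A numerical rating record of this kind might be kept as in the sketch below. The three behaviors listed for industriousness and all of the ratings are invented for illustration, and averaging the ratings is an added convenience that the text does not prescribe.

```python
# Code numbers and their meanings, following the five-step scale described above.
SCALE = {
    1: "never",
    2: "very seldom",
    3: "occasionally",
    4: "much of the time",
    5: "constantly",
}

# Hypothetical observable behaviors for the trait "industriousness".
BEHAVIORS = ["begins work promptly", "completes assignments", "works without prodding"]

# Invented ratings: one code number per behavior for each student.
ratings = {
    "Joe M.": [4, 5, 3],
    "Sally R.": [5, 5, 4],
    "Bill J.": [2, 3, 2],
}


def describe(student):
    """Translate a student's code numbers back into descriptive phrases."""
    return {b: SCALE[r] for b, r in zip(BEHAVIORS, ratings[student])}


def mean_rating(student):
    """Average code number across behaviors (an assumption, not from the text)."""
    scores = ratings[student]
    return sum(scores) / len(scores)


print(describe("Bill J."))
print(mean_rating("Joe M."))
```

Looking up a rating in `SCALE` raises an error for any number outside 1 to 5, which guards against recording a code the scale does not define.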


² Eugene R. Smith and Ralph Tyler, Appraising and Recording Student Progress
(New York: Harper & Brothers, 1942), pp. 478-79.

The graphic rating scale has descriptive phrases printed hori- 
zontally at various points underneath a straight line across the page. 
The rater indicates the subject’s standing with respect to each trait 
by placing a check mark at an appropriate point along the line. 
This type of scale solves the problem of being forced to make rather 
gross evaluations, which must be done when either a descriptive or 
numerical scale with only from three to seven degrees of each 
characteristic is being used. In the graphic scale the degrees of each 
characteristic are arranged so that the rater can make as fine dis- 
tinctions as he wishes. This type of scale is illustrated by the in- 
structions and sample characteristics presented below. 


Instructions: Place a check mark on the line at the point which best 
describes this person. The descriptions below the line indicate variations 
in the characteristic being rated. You are not required to place your 
check marks only at these descriptive points. You may check anywhere 
along the line. 


CHARACTERISTIC RATING SCALE

Workmanship
|_______________________________________________________________________|

Sloppy worker.      Frequently          Occasionally        Excellent worker.
Careless about      makes mistakes      does fine work.     Takes care of
use of tools.       and is careless     Needs help in       tools in professional
Needs constant      in use and          looking after       manner.
supervision.        handling of tools.  tools at times.     No need for
                                                            supervision.


Another example will serve to illustrate how a simple scale may 
be used to get the opinion of students on the helpfulness of con- 
ferences with teachers or counselors in the selection of high school 
courses.


Directions: Place a check mark on the line at the point which shows 
how helpful you think your conferences with your counselor have been 
in helping you select your high school courses. Place the mark anywhere 
on the line between “Very Helpful” (100) and “Of No Help” (0).


Very Helpful         Helpful           Of No Help
|-----------|-----------|-----------|-----------|
100         75          50          25          0

Rating scales may also be divided into three types: self-rating 
scales, scales for rating others, and scales that do both. Regardless of 
the type, their purpose is to translate impressions of people or things 
into quantitative terms. They seem to work best for judging be- 
havior, performance, or product that is easily observable. 

Although there are standardized rating scales of various types on 
the market, they are not very popular, primarily because there is not 
much in the way of normative data to recommend their purchase 
and because they can be constructed to meet the local needs of the 
teacher or school. 

Although rating scales fulfill a real need in the field of secondary 
school evaluation, they should be used to appraise only those 
qualities for which no valid objective measures are available. They 
are intended to supplement, not supplant, more objective devices. 

Since rating scales are but tools in the hands of the teacher or 
other rater, they are no better or worse than the skill or ability of the 
rater using them. The most common faults in using the rating scale 
are as follows: 


1. Inadequacy or errors in observation of the behavior, quality, or 
performance occur. People simply fail to see or hear all that is sig- 
nificant in a product or performance. This is especially true when 
the observer (rater) is inexperienced in the area of the observation.

2. Personal bias or prejudice of the rater enters into the rating, causing
him to exaggerate certain features and minimize others in terms
of his preconceived ideas and notions. This error usually works
with groups and tends to make the ratings of one rater typically
higher or lower than the ratings of other judges rating the same
traits.

3. The “halo effect” develops and causes the rater to be influenced in
his rating of several traits by his attitude toward one trait of the
individual whom he is rating. For instance, Mr. Johnson may be
so greatly impressed by Roy’s musical ability that he rates him
high in athletic and academic skills as well as in social adjustment,
even though such ratings are not actually warranted by the facts.
The “halo effect” is usually operative when a judge rates a person
the same, or nearly the same, on all traits.

4. The rather common tendency exists to avoid extremes in ratings,
that is, rating apparently inferior traits or skills higher than they
deserve and rating very superior characteristics or qualities lower
than is warranted. This may be explained by several conditions:

a. The observer may not be aware of the extreme ranges existing 
in the traits he is judging. 

b. He may not be sensitive to extreme variations in the traits 
being considered. 

c. He may be personally insecure and not risk extreme ratings. 

d. He may feel inadequate or incompetent to judge fairly and 
thus stick to a safe middle ground where his ratings cannot 
seriously hurt anyone. 

5. Misinterpretation of the meaning of the trait or quality being 
rated is made. Unless great care is taken to define clearly each 
trait and degree of each trait, it is entirely possible for individual 
judges to give completely different meanings to the items in ques- 
tion. Qualities and characteristics such as “cooperation,” “work- 
manship,” “responsibility,” and “democratic” may have entirely 
different meanings to different people. 

6. The “generosity error” occurs when the judge, not being certain 
about the existence, meaning, or degree of a certain trait or 
quality, rates that particular item rather high, giving the subject 
the benefit of the doubt. 

7. Units on the scale are not equal at all points along the continuum. 
This means that a small difference in the rating of a trait possessed 
by two people at the extreme “top” or “desirable” end of the scale 
may actually be more meaningful or indicate a greater variation 
than an equal scale difference at the middle or bottom of the 
scale. 
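
Two of the faults listed above leave a visible trace in a completed rating sheet and can be screened for mechanically. The following is a minimal sketch, not a validated procedure: it assumes ratings on a 1-to-5 scale recorded by a single rater, and the function name, the trait names, and the spread threshold of 0.5 are all hypothetical choices made for illustration. A flag is only a prompt to look again at the observations, since a student may in fact deserve the same rating on every trait.

```python
from statistics import pstdev


def flag_rating_patterns(ratings, scale_min=1, scale_max=5, spread_threshold=0.5):
    """Screen one rater's sheet for two common rating faults.

    ratings: dict mapping student -> dict mapping trait -> numeric rating.
    Returns (halo_flags, avoids_extremes):
      halo_flags      -- student -> True when all traits received the same,
                         or nearly the same, rating (possible halo effect)
      avoids_extremes -- True when the rater never used either end of the scale
    The threshold is an illustrative assumption, not an established cutoff.
    """
    halo_flags = {}
    for student, traits in ratings.items():
        scores = list(traits.values())
        # A very small spread across several traits suggests a halo effect.
        halo_flags[student] = len(scores) > 1 and pstdev(scores) <= spread_threshold

    all_scores = [s for traits in ratings.values() for s in traits.values()]
    avoids_extremes = not any(s in (scale_min, scale_max) for s in all_scores)
    return halo_flags, avoids_extremes


if __name__ == "__main__":
    sheet = {
        "Roy": {"music": 5, "athletics": 5, "academics": 5, "social": 5},
        "Ann": {"music": 2, "athletics": 4, "academics": 3, "social": 5},
    }
    print(flag_rating_patterns(sheet))
```
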


In addition to the above errors common to the use of the rating
scale, Hahn and MacLean have listed several generalizations, based
upon research evidence, concerning errors in ratings obtained with
scales:


a. Self-ratings tend to be high on desirable traits and to be low on 
undesirable ones. 

b. One tends to rate his own sex higher than the opposite sex on
desirable traits, the reverse being true on undesirable traits.

c. Men are more lenient in their ratings than women. 

d. In self-ratings, superior individuals underestimate themselves and 
inferior individuals overrate themselves, the latter having the 
greatest error. 




e. Parents overrate their children as a rule, but they underestimate
superior children. [Although not supported by research evidence,
it is the author’s opinion that the same error is made by teachers
who tend, in general, to overrate their own pupils, but under-
estimate superior pupils, even in their own class or group.]

f. Two ratings by the same judge are no more valid than one.* 


As was stated earlier, most of the errors in ratings are due not to 
the scale itself but to the rater using the scale. Consequently, the 
following suggestions are offered to reduce or minimize these errors: 


1. All items to be rated should be clearly defined, using specific be-
havioral descriptions where appropriate.

2. Interchange the top and bottom ends of the scale to keep the 
raters alert and prevent checking from becoming automatic. 

3. Keep the number of characteristics to be rated reasonably small, 
usually no more than five or six. 

4. The rater should be permitted (even encouraged) to specify on 
which of the traits he is competent or not competent to make a 
rating. 

5. Rate one trait for all students in the class before going on to the 
next trait. This tends to reduce the “halo effect.” 

6. Have two, three, or more raters rate the same traits whenever pos- 
sible. A composite or average rating tends to eliminate individual 
errors and makes the rating more reliable. 

7. Compare means (averages) of ratings of different teachers of the 
same students to show up consistent errors of over- or underrating 
by certain teachers in comparison with others. 

8. Educate teachers through in-service programs to become more 
objective in their observations and to guard against prejudice; to 
agree on the meanings of traits to be rated, and to develop a thor- 
ough understanding of and belief in the rating program. 
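
Suggestions 6 and 7 above involve simple arithmetic that can be sketched as follows. This is an illustrative sketch only: it assumes every teacher has rated the same students on the same trait with a numeric scale, and the function names and the teacher and student labels are hypothetical.

```python
from statistics import mean


def composite_ratings(ratings):
    """Average each student's ratings across all raters (suggestion 6).

    ratings: dict mapping rater -> dict mapping student -> numeric rating.
    """
    students = next(iter(ratings.values())).keys()
    return {s: mean(by_student[s] for by_student in ratings.values()) for s in students}


def rater_offsets(ratings):
    """Compare each rater's mean with the mean of all raters (suggestion 7).

    A positive offset suggests consistent overrating relative to colleagues,
    a negative offset consistent underrating.
    """
    rater_means = {r: mean(by_student.values()) for r, by_student in ratings.items()}
    grand_mean = mean(rater_means.values())
    return {r: round(m - grand_mean, 2) for r, m in rater_means.items()}


if __name__ == "__main__":
    cooperation = {
        "Teacher A": {"Joe": 3, "Mary": 4, "Oscar": 5},
        "Teacher B": {"Joe": 2, "Mary": 3, "Oscar": 4},
        "Teacher C": {"Joe": 1, "Mary": 2, "Oscar": 3},
    }
    print(composite_ratings(cooperation))
    print(rater_offsets(cooperation))
```

In this made-up example the three teachers rank the students identically, yet Teacher A rates a full point above the group mean and Teacher C a full point below it, which is the kind of consistent difference suggestion 7 is meant to expose.
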


Even though research has shown that ratings are more reliable 
when they represent the pooled judgments of a number of raters, 
there is value from the guidance point of view in retaining the 
individual judge’s ratings to show both agreement and variability 
in their opinions. 

It is entirely possible for two students each to obtain a pooled 
rating of average by three judges on any given trait, even though one 


* Milton E. Hahn and Malcolm S. MacLean, General Clinical Counseling in Edu-
cational Institutions (New York: McGraw-Hill Book Co., Inc., 1950), p. 163.




student’s average may result from an “average” rating from all 
three judges, whereas the other student’s average may be made up of 
an “average” rating by one judge, a “below average” rating by the 
second judge, and an “above average” rating by the third judge. 
Thus, the second student’s average rating would accurately reflect
the appraisal of only one of the three judges. It is quite possible in the
case of the second student that each judge could be correct in his 
appraisal based upon his experience and observation of the indi- 
vidual in question; yet the individuality of the ratings is lost if 
averages are computed. “To avoid such difficulties, many schools 
use a method of pooling that portrays all ratings. In this method, 
each rater’s evaluation is displayed in a single table. This table is 
illustrated in Figure 8. Note that each teacher’s rating is identified 
by placing his number, 1, 2, or 3, in the appropriate column. 


                               RATINGS *
NAME OF STUDENT      Below        Average      Above
                     Average                   Average
Adams, Joe           1            3            2
Hand, Mary           -            1, 2, 3      -
Walsh, Oscar         2            -            1, 3


* Put number of person making rating in column. Identify this person by record-
ing his number and name below.


1. Miss Smith 2. Miss Wells 3. Mr. Yound 


Fig. 8. Summary of ratings in cooperation 


This plan, or a modification thereof, makes it possible to pool the 
judgments of raters without obscuring the differences in their evalu- 
ations.” * 
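
The pooling method quoted above amounts to grouping rater numbers by student and by rating level instead of averaging them. A minimal sketch follows, using ratings like those in Figure 8; the function names, the tuple layout of the input, and the column widths are assumptions made for illustration and are not part of the quoted source.

```python
from collections import defaultdict

LEVELS = ("Below Average", "Average", "Above Average")


def pool_ratings(ratings):
    """Group rater numbers by student and rating level without averaging.

    ratings: iterable of (student, rater_number, level) tuples,
             where level is one of LEVELS.
    """
    table = defaultdict(lambda: {level: [] for level in LEVELS})
    for student, rater, level in ratings:
        table[student][level].append(rater)
    return {s: {lvl: sorted(nums) for lvl, nums in cols.items()} for s, cols in table.items()}


def format_table(pooled):
    """Render the pooled ratings as a plain-text summary table."""
    lines = [f"{'NAME OF STUDENT':<18}" + "".join(f"{lvl:<16}" for lvl in LEVELS)]
    for student, cols in pooled.items():
        cells = [", ".join(map(str, cols[lvl])) or "-" for lvl in LEVELS]
        lines.append(f"{student:<18}" + "".join(f"{c:<16}" for c in cells))
    return "\n".join(lines)


if __name__ == "__main__":
    cooperation = [
        ("Adams, Joe", 1, "Below Average"),
        ("Adams, Joe", 3, "Average"),
        ("Adams, Joe", 2, "Above Average"),
        ("Hand, Mary", 1, "Average"),
        ("Hand, Mary", 2, "Average"),
        ("Hand, Mary", 3, "Average"),
        ("Walsh, Oscar", 2, "Below Average"),
        ("Walsh, Oscar", 1, "Above Average"),
        ("Walsh, Oscar", 3, "Above Average"),
    ]
    print(format_table(pool_ratings(cooperation)))
```

Because each cell holds rater numbers, two students whose averaged ratings would be identical remain distinguishable: complete agreement appears as one filled column, and disagreement as entries spread across several.
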


INVENTORIES 


The term inventory is sometimes used synonymously with check-
list and/or questionnaire. In reality, it is very difficult to


* Clifford E. Froehlich and John G. Darley, Studying Students—Guidance Meth-
ods of Individual Appraisal (Chicago: Science Research Associates, 1952), p. 115.




differentiate between them. If differences do exist, they are more 
artificial than real. Two well-known evaluation instruments in the 
area of personality adjustment are very similar in construction and 
identical in purpose, yet one is called an inventory (SRA Youth 
Problems Inventory) and the other a checklist (Mooney Problem 
Check-List). Furthermore, any prepared list of statements or ques- 
tions presented to an individual to which he responds in some way 
is a questionnaire. Thus, both the checklist and the inventory can 
be classified as questionnaires. 

By way of differentiation, however, the inventory is usually longer 
than the checklist, is more comprehensive, is not ordinarily teacher- 
made, and is based upon more intensive research into the topics or 
areas being investigated (therefore, likely to be more valid). In 
contrast to the questionnaire, it may be said that, whereas a single 
inventory usually covers just one topic or area (interest, attitude, 
appreciation), a questionnaire may sample opinions or information 
in a variety of categories. The questionnaire, like the checklist, is 
more likely to be made to order than is the typical inventory in 
order to meet the unique needs of the person who will use it. 
Further, with regard to format of the instrument itself, the responses
to the various items of the inventory are usually all made in the 
same way, i.e., by check mark or other simple method, while the 
questionnaire may require that information be supplied by check 
mark, sentences, or even paragraphs in some cases. There are nu- 
merous exceptions to the above differences, however. It is unfortunate 
that so much confusion exists in the application of these terms to 
specific instruments. A set of practical criteria for classifying these 
instruments would be very useful not only to the teacher but to the 
researchers as well. The term inventory has been applied generally 
to devices which aid the teacher studying the interests, attitudes, 
and personality (including behavior and problems) of students. 

Inventories help in analyzing many factors influencing school 
work and in developing remedial procedures. Humphreys and Trax-
ler say that any inventory serves at least three purposes:


“a. It enables individual students to determine exactly what their
specific problems and problem areas are.
b. It gives students a recognition that they have common problems.




c. It provides the school or college with information about problems 
of students both as individuals and as a group.” 


Vocational interest inventories may be classified in three cate-
gories, or types, according to the degree of generality which the
scores reveal:


1. Interest in specific occupations, such as the Strong Vocational In- 
terest Blank (Men or Women). 

2. Interest in families of occupations, illustrated by the Cleeton Vo- 
cational Interest Inventory, the Brainard Occupational Preference, 
and the Lee-Thorpe Interest Test. 

3. Interests in broad fields which may cut across vocational groups,
for example, the Kuder Preference Record—Vocational.


Such inventories are of considerable value to the student to help 
him improve his self-understanding in the area of his likes and 
preferences, and to the teacher by providing him with a knowledge 
of the interests of his students as a basis for guidance and instruc- 
tion. 


It must be borne in mind when utilizing interest inventories that
the results obtained are but one indication of the student's interest,
the other evidences of specific interests being found in his:


1. Hobbies or leisure time activities.

2. Verbally expressed interests or likes.

3. Reading preferences.

4. Choices of elective school subjects and activities (music, drama,
art, forensics, athletics, and clubs).


The teacher using interest inventories with his students would also 
do well to remember that most such instruments do not provide a 
measure of the intensity of the interest(s) of the individual, but
merely canvass the various fields in which he expresses some
interest.

Personality, adjustment, and behavior inventories, as well as 
problem inventories, are all designed to help the student and/or 
the teacher to gain greater insight into the student’s mode of acting, 


5 J. Anthony Humphreys and Arthur E. Traxler, Guidance Services (Chicago: 
Science Research Associates, 1954), p. 196. 




thinking, and feeling. They ask the student to respond, as ob- 
jectively as possible, to items probing his behavior, his likes and 
dislikes, his environment, his fears, his problems, and many other 
aspects of his life. The purpose of using each instrument is to help 
the teacher or counselor to guide more adequately the personal 
development of the student. 

Most personality and adjustment inventories are designed to yield 
data on a student’s adjustment through his own responses to a num- 
ber of questions or statements. Therefore, such an instrument may 
be considered a self-inventory, which is essentially a system of self- 
rating whereby a pupil records his opinion of himself in areas 
relevant to his emotional and social adjustment. Most questions or 
statements are subjective in the essential sense that only the indi- 
vidual himself can render an opinion concerning them, since they 
deal with his own inner experience, observable by no one but himself. 

In general, the self-inventory technique of evaluating social and
emotional adjustment may usually be considered desirable and
helpful, but it is seldom even nearly sufficient by itself. The speed,
efficiency, and low cost of evaluation by the self-inventory method
make it likely that, in the hands of the intelligent and well-rounded
teacher, it may frequently yield worth-while results and quick
insights that could otherwise be attained only through far more
laborious and demanding procedures. The reliability or consistency
of the results obtained with self-inventories is usually far greater
than that obtained with any of the other available methods, such as
rating devices, observational techniques, interviews, or descriptive
records. When used to screen a large group for students who are
maladjusted and in need of guidance, the self-inventory can usually
be valuable.

The validity of a self-inventory is dependent upon its content,
its interpretive hints, the conditions under which it is administered,
and, to a great degree, upon the rapport between pupil and teacher
or examiner. With good rapport, the examiner has to (1) select the
best possible inventory for his particular conditions, problems, and
facilities, and (2) interpret the evidence in accordance with both the
author's instructions and his own common sense, psychological
insight, and training. A teacher should have an adequate background
in the fields of personality and mental hygiene before interpreting
a self-inventory.




It should be remembered that all types of personality inventories 
are subject to error because— 


1. The makers of the instruments are not agreed as to the major as- 
pects of personality. 

2. They are liable to be low in reliability and inconsistent over a con- 
siderable period of time. 

3. The correlation between the way the student says he acts, thinks,
or feels, and the way he actually thinks, feels, or acts may be very
low.

4. The student may not feel that it will be advantageous to him to
give truthful responses.


The behavior inventory, as constructed and explained by Torger-
son, is a device to assist teachers in making more effective obser-
vations of their students as a basis for identifying those that are
maladjusted or in need of some type of remedial or therapeutic
assistance. The inventories list some two hundred undesirable or faulty
behavior manifestations frequently found in children in the follow-
ing classifications: social adjustment, health and physical status,
reading, arithmetic, spelling, hearing, vision, speech, and scholar-
ship. This list, with some modifications, is presented here to acquaint
the teacher with some of the more common observable behavior
symptoms which are usually associated with student problems. Of
course, no single symptom or condition should be considered sig-
nificant unless it is persistent, frequent, or unusually severe or seri-
ous; in other words, somewhat characteristic of the student.


BEHAVIOR INVENTORIES 
SCHOLARSHIP 


Work Habits 


1. Inability to plan and outline

2. Inability to budget time effectively 

3. Inability to maintain interest 

4. Inability to concentrate on work—wasting of time 
5. Inability to “get started" on assignments 


6 Theodore L. Torgerson, Studying Children (New York: The Dryden Press,
1949), pp. 52-76.




Study Skills 
6. Very slow reading rate 
7. Failure to comprehend text 
8. Inefficiency in use of index 
9. Inability to read maps and graphs 
10. Inefficiency in use of library 
11. Inefficiency in use of dictionary 


Speaking Vocabulary 
12. Vocabulary very limited 


Achievement 
13. Below average marks in subjects 
14. Unsatisfactory marks in subjects 

READING

Sight Vocabulary
1. Faulty word recognition
2. Repetition of words
3. Miscalling of words
4. Guessing at words
5. Confusion of letters
6. Confusion of words
7. Addition of words
8. Skipping of words
9. Faulty mastery of basic skills

Word Analysis
10. Mispronunciation
11. Inability to sound letters
12. Refusal to attempt difficult words
13. Reversal of letters
14. Reversal of syllables
15. Reversal of words

Meaning Vocabulary
16. Meaning vocabulary inadequate

Comprehension
17. Inability to recall what he reads
18. Inability to understand what he reads
19. Inadequate phrasing

Rate
20. Word by word reading
21. Reading rate too slow

Interest
22. Dislike of reading




SPELLING 
1. Addition of letters 
2. Omission of letters 
3. Substitution of letters 
4. Transposition of letters 

ARITHMETIC
Deficient in:

Skills
1. Number facts
2. Column addition
3. Carrying and borrowing
4. Two- and three-place multipliers
5. Long division
6. Reading and writing of numbers

Fractions
7. Addition of fractions
8. Subtraction of fractions
9. Multiplication of fractions
10. Division of fractions
11. Proper fractions
12. Improper fractions
13. Mixed numbers
14. Reduction of fractions

Decimals
15. Addition of decimals
16. Subtraction of decimals
17. Multiplication of decimals
18. Division of decimals
19. Reading and writing of decimals

Percentages
20. Problems in percentage
21. Expression of decimals as per cents
22. Expression of per cents as decimals

Problems
23. Written problems


VISION 


Acuity Far Point 
1. Inability to see blackboard distinctly
2. Holding book too close to eyes 
3. Holding book too close to desk 




Acuity Near Point 


4. Confusion of words and letters 

5. Holding head to one side 

6. Covering or closing one eye when reading 
7. Frowning when reading 


Discomfort 


8. Eyelids inflamed or swollen 

9. Eyeballs inflamed 
10. Discharge from eyes 
11. Pain in and about eyes 
12. Pain at back of neck 
13. Headaches after reading or movies 
14. Eyes sensitive to light 
15. Eye fatigue when reading 
16. Unwillingness to wear glasses 
17. One eye turning in (squint) 
18. Trembling or twitching of eyes 


HEARING 


Acuity 
1. Inability to hear questions first time 
2. Imitation of other pupils 
3. Apparent confusion 
4. Daydreaming 
5. Faulty speech 
6. Unintelligible speech 
7. Speaking in a monotone
8. Voice overloud or oversoft 
9. Use of symbolic gestures in lieu of words 
10. Language handicap 
11. Very intent listening 
12. Ignoring of verbal instructions 
13. Reading of lips, watching of faces 


Ear Trouble 


14. Spells of dizziness
15. Noises in ears
16. Excess of wax in ears
17. Discharge from ears
18. Earaches or mastoid pains
19. Previous mastoid operation




HEALTH

Physical Development
1. Obesity
2. Excessive underweight
3. Excessive height
4. Retarded stature

Minor Ills
5. Mouth breathing
6. Frequent severe colds
7. Frequent sore throats
8. Chronic cough
9. Poor teeth
10. Sore gums
11. Swollen glands in the neck
12. Protruding eyeballs
13. Dry, scaly skin
14. Frequent itching
15. Convulsions, fits
16. Blank spells
17. Fainting spells
18. Nervous mannerisms, tics
19. Puffiness of eyes and face
20. Swollen hands and feet
21. Sallow complexion
22. Listlessness, fatigue
23. Falling asleep in school
24. Frequent absence due to illness

Handicaps
25. Faulty posture
26. Awkward gait
27. Lameness
28. Partial paralysis
29. History of rheumatic fever
30. History of scarlet fever
31. No immunization against disease

SPEECH
Vocalization 
1. Speech handicap causing him to remain silent 
2. Overloud speech 
3. Oversoft speech 
4. Annoying vocal quality 
5. Lack of variety in vocal patterns 
6. Tiresome repetitions in vocal inflections 
7. Voice suggestive of person of different age or sex 
8. Voice not characteristic of individual 
Articulation 
9. Speech too slow 
10. Speech too rapid 


11. Omission or elision of sounds 


12. Addition of superfluous sounds
13. Substitution of one standard English sound for another
14. Substitution of an unusual sound for a standard English sound
15. Difficulty in understanding his pronunciation of certain words
16. Clumsy speech
17. Speech unduly laborious
18. Emphasis on delivery rather than meaning of speech
19. Distracting movements of lips or tongue

Rhythms
20. Occasional blocking of speech
21. Blocking of speech by stoppage of air flow
22. Blocking of speech by restricting movements of tongue or lips
23. Unnecessary repetition of certain sounds
24. Distracting movements of head, face, shoulders, hands, etc., dur-
ing speech block

Linguistics
25. Difficulty in understanding simple oral directions
26. Difficulty in understanding simple written directions
27. Difficulty in understanding the meaning of his thought, although
the words are clear
28. Difficulty in recalling names of common objects
29. Use of signs and gestures to express his wants
30. Difficulty in recognizing simple words when spelled for him orally
31. Difficulty in learning to read, write, or spell


SOCIAL BEHAVIOR

1. Quick anger
2. Temper tantrums
3. Lack of cooperation
4. Sex irregularities
5. Uncontrolled bladder or bowels
6. Enuresis (bedwetting)
7. Truancy, unexcused absences
8. Cheating
9. Resentment of correction
10. Destructive tendencies
11. Overcriticism of others
12. Irresponsibility
13. Impudence, defiance
14. Quarrelsome attitude
15. Cruelty to animals
16. Irritability
17. Belligerence
18. Bullying of others
19. Vindictiveness
20. Stealing
21. Dishonesty, untruthfulness
22. Marked personality change
23. Negative attitude
24. Running away from home
25. Seeking undue attention
26. Overconscientious attitude
27. Emotional inadequacy
28. Procrastination
29. Pessimism
30. Whining
31. Suspicion of others
32. Isolated play
33. Avoidance of others, unfriendly attitude
34. Ostracism
35. Overreligious attitude
36. Daydreaming, preoccupation
37. Preference for play with younger children
38. Physical cowardice
39. Selfishness
40. Feigning of illness
41. Oversubmissive attitude
42. Depression
43. Overdependency
44. Sullenness
45. Nervous tension, tics
46. Biting of fingernails
47. Fearfulness, timidity, shyness
48. Anxieties
49. Jealousy
50. Tendency to cry easily


All of the sections of the inventory are applicable to grades 1 to
9, while all except spelling and arithmetic are applicable to grades
10 to 12.


Torgerson adds a number of cautions to be kept in mind when 
using the inventories: 


1. Do not guess. An inaccurate record is useless and misleading.

2. Do not infer anything which you cannot observe or determine.
Record 0 if you do not know.

3. Avoid making a record solely on the basis of memory. Keep a log
of observations.

4. An oversight or failure to report an observation may deprive the
child of the help he needs.

5. The final record should be based upon extended observations.


An individual inventory may be scheduled annually (in place of 
a planning interview) by the student’s adviser or homeroom teacher 
in order to obtain information to understand the student better and, 
therefore, to help him more effectively. Since no two students are 
exactly alike, the teacher can gain a great deal of insight into each 
one through this personal approach by covering such points as “my 
reaction to high school,” “my subject likes (or dislikes),” “my future
educational (or vocational) plans,” “my course of study for the
next year (high school or college),” and “my social problems.” 


MY SCHOOL

      I have difficulty keeping my mind on my studies
      I wish I knew how to study better
      I wish I knew more about using the library

AFTER HIGH SCHOOL?

      What are my real interests?
      What shall I do after high school?
      For what work am I best suited?

ABOUT MYSELF

      I'm easily excited
      I have trouble keeping my temper
      I worry about little things

GETTING ALONG WITH OTHERS

115.  I want people to like me better
116.  I don’t know how to introduce people properly
117.  I wish I could carry on a pleasant conversation

MY HOME AND FAMILY

155.  I have no quiet place at home where I can study
156.  I can't get along with my brothers and sisters
157.  There is constant bickering and quarreling in my
      home

BOY MEETS GIRL

208.  I seldom have dates
209.  I don’t know how to ask for a date
210.  There is no place to dance in the town where I live
211.  I'm bashful about asking girls for dates

HEALTH

240.  I want to gain (or lose) weight
241.  I want to learn how to select foods that will do me
      the most good
242.  I smoke too much


Fig. 9. Major sections and selected items from the
SRA Youth Inventory, Form A




The outstanding example of the problem type inventory is the 
SRA Youth Inventory, Figure 9, which is now available in two forms, 
the Junior Inventory for children in grades 4 through 8, and the 
Youth Inventory for adolescents in grades 7 through 12. 

The manual accompanying the Youth Inventory describes it as “a 
check-list of 298 questions designed to help teachers, counselors, and 
school administrators identify quickly the problems young people 
say worry them most,” and goes on to list the values of the inven- 
tory to the student and to the teachers. The values to the student are 
stated as follows: 


1. It helps the student focus attention on those things which are of
concern to him.

2. It tends to give him a perspective and to establish an order of pri-
ority for attacking his problems.

3. It helps motivate the student to seek the solution of problems he
can handle himself.

4. It encourages him to seek help in working out more difficult prob-
lems.


As for the values to the teacher, the manual says:


The teacher will find the inventory provides information that he needs 
to better understand his students. He can tailor instruction to general 
problems of the class; he can identify the needs of individuals; he can 
identify a need for supplementary teaching aids . . .; he can measure 
change in the group by administering the Inventory annually.
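
The teacher's use described above, identifying general problems of the class and the needs of individuals, comes down to tallying checked items. The following is a minimal sketch under stated assumptions: the function name, the student labels, and the item-to-section mapping shown here are hypothetical illustrations and do not reproduce the publisher's scoring procedure or its norms.

```python
from collections import Counter


def tally_problems(responses, item_sections):
    """Count how often each item and each section was checked by a class.

    responses:     dict mapping student -> set of checked item numbers
    item_sections: dict mapping item number -> section name
                   (an illustrative mapping supplied by the user)
    Returns (item_counts, section_counts) as Counter objects.
    """
    item_counts = Counter(item for checked in responses.values() for item in checked)
    section_counts = Counter()
    for item, count in item_counts.items():
        # Items missing from the mapping are grouped rather than dropped.
        section_counts[item_sections.get(item, "Unclassified")] += count
    return item_counts, section_counts


if __name__ == "__main__":
    sections = {
        115: "Getting Along with Others",
        116: "Getting Along with Others",
        208: "Boy Meets Girl",
        240: "Health",
    }
    class_responses = {
        "Student 1": {115, 208},
        "Student 2": {115},
        "Student 3": {116, 208, 240},
    }
    items, by_section = tally_problems(class_responses, sections)
    print(items.most_common(2))
    print(by_section.most_common())
```

Running the same tally on two successive annual administrations and comparing the section counts is one simple way to look at change in the group, with the caution that a count of checked items says nothing about how intensely a problem is felt.
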


QUESTIONNAIRES 


The questionnaire is a series of questions or statements designed 
to secure responses from individuals concerning their interests, atti- 
tudes, opinions, and judgments, as well as strictly factual informa- 
tion (see Fig. 10). Questioning an individual is perhaps the most 
natural method of obtaining information from him. Of course, the 
individual’s answers are influenced by his willingness to tell the 
truth, his interpretation of the questions, and the accuracy and ex-
tent of his knowledge of the facts.


7 “Examiner Manual for the SRA Youth Inventory” (Chicago: Science Research
Associates, 1953), p. 11.


Fig. 10. Sample questionnaire


FOLLOW-UP OF HIGH SCHOOL GRADUATES
CONFIDENTIAL REPORT
We Are Interested in You—And You Can Help Us Help Others

Year graduated ____
Name of high school ____          Name of City ____

1. Name ____
   Girls, if married give maiden name ____

2. Present address ____

3. Sex ____

4. Marital status:
      Single ____        Divorced ____
      Married ____       Separated ____
      Widowed ____

5. Employment: (Check those applicable)
      Employed full time
      Employed part time
      Unemployed—seeking work
      Armed Services
      Housewife
      In school full time
      Other

6. What is your work?

7. If you are employed for wages, how did you obtain your job? Through:
      Family
      Friend
      Employment agency
      High school staff
      Newspaper
      Found it myself
      Other

8. To what extent is your present job like the type of work you thought you
   would follow when you left high school?
      Didn’t have definite choice
      Not related
      Closely related
      The type of work I wanted

9. What part of your high school education helped you most in your present
   position? (Please name)
      General Studies ____    Vocational Studies ____    Extra Curricular Activities ____

10. How well satisfied are you with your present job?
      Satisfied
      Moderately satisfied
      Indifferent
      Dissatisfied

11. To what extent has the counseling you received in high school been help-
    ful to you?
      Extremely helpful
      Some help
      Very little help
      Not helpful at all
      Didn’t have any in school

12. What occupation do you hope to follow?

13. To what extent did your high school experience give you useful informa-
    tion in the following fields? (Check Little, Some, or Much for each)
      In development of salable skills
      In developing and maintaining your health
      Civic and world affairs
      Marriage and family relationships
      Economic competency (Handling money)
      How to “keep up” in a scientific world
      Appreciation of the beauty in music, art, literature and nature
      Intelligent use of leisure time
      Getting along with others
      Self expression through speech and writing

14. Which was the most difficult problem you had to meet since graduating
    from high school?
      Holding a job or employment
      Making friends
      Military service
      Boy-girl relationships
      Adjusting to marriage
      Further education
      Moral and spiritual
      List other problems

15. What changes in courses or activities in the high school you attended do
    you feel would help the school better prepare other students?

SCHOOL

16. Have you attended any of the following types of schools since high
    school?
      a. College (Name)
         Dates attended
         Major subject
      b. Business College (Name)
         (Dates)
      c. Trade or Technical School (Name)
         (Dates)
      d. Nurse Training (Name)
         (Dates)
      e. Others
         (Dates)
      f. How well do you believe you were prepared for the institution you
         attended?
         Well ____    Fairly well ____    Poorly ____

COMMUNITY AFFAIRS

17. Have you been interested enough to do any of the following since gradu-
    ation from high school?
      a. Vote                              Yes ____  No ____
      b. Voluntary work for:
            Red Cross                      Yes ____  No ____
            Boy or Girl Scouts             Yes ____  No ____
            YMCA                           Yes ____  No ____
            YWCA                           Yes ____  No ____
            Other Similar (Name)
      c. Write to your Congressman or State Legislator?
                                           Yes ____  No ____
      d. Take an active part in any political campaign?
                                           Yes ____  No ____
      e. To what service club do you belong?
         (Kiwanis, Rotary, etc.)

RECREATIONAL INTERESTS

18. What do you do most during your free time to relax?
      a. Participate in sports (Name)
      b. Watch Sports (Name)
      c. Read:
            Magazines (List)
            Newspapers
            Books (List any read in past year)
      d. Watch television (Favorite programs)
      e. Go to movies
      f. Go to dances
      g. Handicrafts (Do-it-yourself projects)
      h. Gardening
      i. Musical activities
      j. Loaf
      k. Other (Name favorite)




Questionnaires have been prepared and are available commercially 
for many purposes, such as measuring the educational background 
of students, the socio-economic status of the home, the nature of the 
student’s adjustment to various aspects of his environment, student 
attitudes and interests, and the like. In addition to the standardized 
questionnaires, the teacher, counselor, administrator, and researcher 
find many day to day uses for questionnaires which they themselves 
construct for specific purposes. Since the validity of the answers ob- 
tained from a questionnaire is in direct proportion to the quality of 
the questionnaire itself, it is essential that the instrument be made 
as good as possible. A hastily constructed, poorly conceived instru- 
ment is likely to result in inaccurate and untrue responses. A good 
questionnaire requires as much thought and care in construction 
as a good test. The Encyclopedia of Modern Education recommends 
thirteen steps in questionnaire development:


1. The technique should be used only when there is no other feasible
means of securing the required information. 

2. It is wise to use some form of motivation or appeal. 

3. Every question should be carefully checked for significance and 
lack of ambiguity. 

4. Long involved questions should be avoided. 

5. Leading questions should be avoided. 

6. “Cross-checking” questions may be worth including in order to 
check on consistency of response.

7. Mechanical features should provide a pleasing format with ade- 
quate space for answers. 

8. Responses or entries required on form should be of as simple 
types as possible. 

9. Technical terms which may be variously interpreted should be 
adequately defined. 

10. Anonymity frequently leads to freer responses.

11. Replies should be in form capable of necessary statistical anal-
ysis.

12. Questions should be as short as possible, since excessive time re-
quired to answer questions leads to careless responses, omission
of items, and fewer returns. 

13. Questions should be subjected to one or more preliminary trials 
with representative samplings of the individuals for whom ques-
tions are designed in order to detect limitations that might other-
wise be overlooked.⁸


Nixon presents a series of practical suggestions as guides in the 
actual construction, development of physical form, and final publi-
cation of the questionnaire prior to its submission to respondents.⁹


His suggestions are particularly valuable for the investigator who is 
carrying on research on a more expansive plane than that of the 
classroom. The following was adapted from Nixon's analysis: 


I. Importance of form and appearance. More answers will be received 
when questionnaire forms are attractively presented and are easy to 
read and mark. 

A. Paper and ink 

1. High quality paper should be used. 

2. The form should be printed if possible. 

3. The use of various colors may enhance the appearance. 

B. Arrangement 

1. The consecutive numbering of each question is recommended. 
The number assigned should appear on every page of the 
questionnaire in the same location, preferably near the upper 
right-hand corner. 
a. This gives the investigator an accurate record of all ques-
tionnaires.
b. Tabulation is facilitated. 
c. Each number should be placed on two questionnaires for 
follow-up use. 

2. Space should be provided for the— 
a. Respondent's name and title. 
b. Organization and location. 

3. Another technique for increasing the number of returns is to 
offer a summary of the findings of the study. 

4. The title of the study should show prominently near the top 
of the first or covering page. 

5. It is recommended that the words “questionnaire” or “check-
list” should not appear on the form at all because they may
not be considered seriously. Use “form” or “instrument.”


8 Encyclopedia of Modern Education, ed. Harry N. Rivlin (New York: The 
Philosophical Library of New York City, 1943), p. 644. 

9 John E. Nixon, “The Mechanics of Questionnaire Construction,” Journal of
Educational Research, 47 (March, 1954), pp. 481-87. 




6. State the time in which it is expected that the form can be 
completed. 

7. Complete information about the recorder or compiler to whom 
the form should be returned, including full name, title, and 
complete address even if self-addressed envelope is enclosed, 
should be included. 

8. The first page should contain a request for the return of the 
completed form. 

II. Directions
A. The questionnaire should be kept simple.

B. Items requiring different types of responses should not be in-
cluded in the same section of the form.

C. Answers should be able to be checked off, rather than require
written comment.

D. Each new section of the form should begin with brief instructions
if necessary, then be followed with a sample response, if required
for clarity.

E. Yes-No replies should be arranged vertically rather than hori-
zontally.

F. On the bottom of the page the word over should be inserted if
there are more items for completion on last page or on the back
of the page.

G. Space for comment should be left after each major item.


The above suggestions for creating and using a questionnaire aim 
at making it as comprehensible, interesting, and practical as possible, 
with the ultimate objective being the return of as many question- 
naires fully and truthfully answered as possible. 

The questionnaire probably has its most frequent use in gathering 
information. It has been criticized as being invalid, but it does save 
time. Many people through the use of a questionnaire can furnish 
information in the same time required for one person to give the same 
information personally to a teacher or counselor. In such a situation 
it is important to keep the information requested as objective and 
impersonal as possible. 

It is the duty of the school counselor to acquaint himself in detail 
with all the important information about each student. A great deal 
of this information will probably be found in a complete cumulative 
record. If the elementary school did not keep a cumulative record, 
however, or if the record should be incomplete, the counselor may 




give a questionnaire to each student in order to obtain a core of 
objective data. Such questionnaires, if carefully prepared, are efficient
tools in securing information about students’ backgrounds, experi- 
ences, economic possibilities and limitations, plans or aims for the 
future, and other similar data. 

Questionnaire forms may be devised as means of obtaining from 
the individual an appraisal of himself by himself in various trait 
areas. In this connection, interest inventories, as discussed previously, 
may be called interest questionnaires. It should also be pointed out 
that standardized tests are in effect questionnaires administered to 
individuals under prescribed conditions. 

Questionnaires may be used in making a general survey of possible 
maladjustments within a classroom. In this case it would be prefer- 
able to use an informal and unsigned questionnaire. The responses to 
questions on such a blank show the areas in which remedial work 
is needed. The responses also may reveal to the teacher many atti- 
tudes and opinions of which he might not otherwise be aware. 

The questionnaire may also be used to discover the extent of the 
student’s knowledge in such areas as educational and vocational 
fields. 


SUMMARY 


Checklists, rating scales, inventories, and questionnaires are all 
helpful supplements to tests and other devices, structured and un- 
structured, used in the school’s evaluation program. The results ob- 
tained through the use of any of these instruments should not be ac- 
cepted as the final answer to any evaluation problem, but rather as 
additional evidence to be used in arriving at conclusions or diagnoses, 
and to check on the validity of other data. 

Since each of these informal instruments can be utilized with 
students, teachers, parents, or laymen, they add desirable versatility 
to the evaluation tools of the school. 


CHAPTER 
10 


Use of Observation, Anecdotal Records, 
and Interviews 


OBSERVATION 


FROM THE BEGINNING of time, man’s senses (especially his eyes and
ears) have provided him with the information about his environment 
upon which he based his evaluation of people, places, things, and 
events. Observation is the most common of all the evaluative tech- 
niques used in teaching, but it must be done purposefully and used 
with understanding if it is to be of the greatest value to the teacher 
and the student. Too often what is referred to as observation is, in 
reality, only seeing. The teacher sees his students whenever they are
in his classroom, but does he really observe them? Usually not. 
What, then, are the differences between "seeing" and "observing"?
First, seeing is general, observing is specific; second, observing is
purposeful and goal-directed, seeing is casual; third, seeing may
involve perception, observing must involve perception; fourth, ob-
serving usually is followed by reporting or recording that which is
observed, while seeing is not; fifth, the observer may supplement his
observation by the use of checklists or inventories; the seer does not
deem it necessary to do so.

Upon the basis of the differences between observing and seeing 
noted above, observation may be defined as purposeful seeing di- 
rected toward the obtaining of useful facts or information about a 
specific person, place, event, object, situation, or condition. It is
evident, then, that observation is not only flexible in its adaptability 
to a wide variety of purposes, but that it demands considerable ex-
pertness on the part of the observer if it is to attain its optimum
value as an evaluative technique. 

In Chapter 2 the various areas and phases of student growth and 
development which must be evaluated by the school, and, more 
specifically, by the teacher, were discussed. The salient points pre- 
sented in that discussion were that— 


1. The individual is evaluated from his preschool through his post- 
school days, 

2. All of the student's activities—curricular, cocurricular, and out of 
school—must be appraised, 

3. Evaluation must provide a total picture of the student, including 
his academic achievement, his mental abilities, his health and phys- 
ical skills, his attitudes, his work habits, his critical thinking 
ability, and his social behavior, and 

4. Evaluation takes place in the classroom, in other school activities, 
in the home, in the community, in the place of employment. 


In short, the teacher must know the student thoroughly with 
respect to every phase and facet of his total being—mental, physical, 
emotional, social, academic, and vocational. He must know him 
(the student) over as long a period of time and in as many different 
areas of activity as possible. Only then can the teacher fully ap- 
preciate and understand the motives, incentives, frustrations, assets, 
liabilities, desires, wishes, and problems that combine to make each 
student a unique individual. Only then can he help the student to 
plan an educational program tailored to his individual dimensions
and proportions.

The effective teacher is ever alert to detect the unique symptoms 
of behavior that distinguish Sam from Jim, Terry from Toni, Jan 
from Jane. It makes no difference whether it be in the classroom, the 
gymnasium, or the lunchroom, on the front steps of the school 
building, at the movies, the football game, or the local church or 
department store—this teacher is observing. He wants to know how 
each student reacts to different situations; whether he is a leader or 
a follower, a “bookworm” or an all-round good fellow; whether he is 
shy or aggressive; how he reacts to criticism and to praise; whether 
he “digs in” or gives up in the face of adversity; how well he is liked
by his fellow students; whether he is active in student affairs or 
prefers to be alone; how mature he is socially; whether or not he 




“blows up” over small disappointments. All these and many other 
forms of behavior he regards as symptoms which will help him to 
understand each boy and girl better. 

Constant attention to the behavior symptoms of each student is 
the key to early identification of special problems and incipient 
maladjustments. Early identification followed by appropriate re- 
medial and therapeutic treatment will help to avoid a great many 
later and more serious difficulties. The school’s first line of defense 
against serious academic and behavioral problems is the alert teacher 
who observes the earliest symptoms of such conditions in his day to 
day contacts with his students. It is obvious that the only practical 
means of detecting many of these students with problems is observa- 
tion. 

The authors are not so naive as to contend that every observed 
symptom is an outgrowth of a single precipitating cause (although 
some may be!) or that each symptom leads directly to a specific 
problem. They do, however, believe that the teacher can be helped to 
do a better job of observing if he has some notion of the kinds of 
behavior symptoms which usually are indicative of somewhat ab- 
normal circumstances or conditions surrounding or affecting the 
student, and which, therefore, bear watching. Torgerson's¹ behavior
inventories, shown in Chapter 9, represent one tool to aid the teacher 
in making his observations more objective. 

Since a great majority, if not all, of the symptoms and conditions 
listed in the inventories are characteristics of the individual student, 
we might be led to believe that observation is entirely useless as a 
group evaluative technique. Such is far from being true, however. 
Much of the effectiveness and value of observation as an evaluative 
technique lies in its adaptability to group evaluation in areas that 
are at present impossible to appraise adequately in any other way. 

As an example, a warm, congenial classroom or group "climate" is 
an essential adjunct to effective teaching and learning. No two 
persons, teacher or administrator, would be likely to define this 
desirable “climate” in the same way; yet it is obvious to any un- 
biased observer that it exists in certain classrooms, not in others. The 
teacher, in appraising the group climate of his class, looks for friend-
liness, warmth, congeniality, absence of fear and tension, absence of
an overcritical spirit, absence of undue competition among class
members, cooperativeness, acceptance of students by each other, and
a good rapport between teacher and students. On the other hand, he
recognizes a poor “climate” if he observes unfriendly and harsh
criticism of one student or group by another, individual or group
rivalry that destroys cooperation and fair play, rejection of one indi-
vidual or group by another, presence of cliques and a too strong
“we-feeling” among subgroups in the class. In an atmosphere such
as this, learning is seriously affected and both individual and group
activities suffer. It is only through intelligent observation that these
characteristics of the group may be determined and appraised.


1 Theodore L. Torgerson, Studying Children (New York: Dryden Press, 1947),
p. 45.

Although every known device, technique, and method should be 
employed to aid the teacher in the evaluation of student growth and 
development, each one has its own peculiar and unique advantages 
and limitations. It is up to the user (the teacher) to weigh these 
advantages and disadvantages and then (1) to select the method or 
tool best fitted to the particular situation at hand, and (2) to accept 
the results with due consideration for all the limitations of the 
method or device. 


Advantages of observation as a method of evaluation 


Observation— 

1. Is easy to use (although not easy to use well!). 

2. Requires no special or additional tools or equipment. 

3. Encourages students to act naturally, since no special or unusual
conditions exist during observation.

4. Is adaptable to both individuals and groups, and flexible enough 
to be used to gather data in almost any situation or under any 
conditions or circumstances. 

5. Provides developmental data, i.e., information about day to day 
changes in growth and development. 

6. Makes possible continuous evaluation of the individual. 

7. Is useful and usable with children of all ages. 

8. Is usable with children of any racial or cultural background, since 
language or custom is not a factor in its application. 

9. Can be (should be!) carried on as an integral part of the regular 
teaching function, i.e., no extra time is required to observe. 

10. Reveals the total person—his verbal as well as his bodily mani- 
festations of joy, sorrow, disgust, frustration, and so on. 




Limitations of observation as an evaluative technique 


However, observation also has some drawbacks, since it—

1. Reveals only overt behavior, i.e., symptoms; reasons or causes
precipitating or underlying the behavior must still be determined.

2. Is cross-sectional in nature; i.e., one can observe only what is
happening now. If observations are infrequent, this may lead to
many gaps in the data obtained.

3. Is difficult to interpret, i.e., what is observed may lack meaning
and significance for the observer.

4. May be difficult to record accurately what is observed, thus limit-
ing the usefulness of the information.

5. Depends for its value upon the purpose of the observer, i.e., its
value is in direct proportion to the intensity of effort applied in
the observation. Purposeless observation contrasted with purpose-
ful observation may be likened to a casual walk around the block
contrasted with a brisk hike as a means of keeping in physical
condition.

6. Is influenced tremendously by the inferences, attitudes, biases,
and prejudices of the observer. If the observer sees only what he
wants to see and hears only what he wants to hear, what he
records may present an entirely false picture or appraisal of the
situation.

7. Is of limited value when the observer has had little or no previous
training or experience in observation per se, or when he lacks ex-
perience with, or understanding of, the event or type of behavior
being observed. Knowing “what to look for” (what is signifi-
cant), “how to find it,” “when to expect it,” and “what it means”
are highly desirable, if not completely essential, competencies of
the skilled observer.


Every teacher should consider it a professional obligation to im- 
prove his skill as an observer, since effective observation is an 
inextricable phase of effective teaching. The observer, the teacher in 

this case, holds the key to the success or failure of observation as a
method of evaluation. The personal equation cannot be removed 
from the technique. Therefore, the teacher who would improve his 
observational skill must— 

1. Look at students, events and situations with a purpose. He 
must want to learn something. “How does John react to criticism by 
other students, by the teacher, or by others?” “Does Jane know how
to use an index efficiently?” “Does Sam seem tired and apathetic
during class?” “Do her classmates seem to resent or reject Betty?” 
“Is Dick improving in his ability to express himself orally?” “Is 
Nancy’s spelling showing any improvement?” “What circumstances 
or conditions seem to create tensions and dissensions in the class?” 

In each of these situations the teacher has a reason for observing 
his students, either as individuals or as a group. If Jane's school 
work is not as good as it could be, perhaps the reason is that she
doesn't know how to use an index. By watching Jane as she studies, 
this fact might be evident. Betty's teacher may feel that her bellig- 
erent behavior may be a retaliation for her failure to be accepted 
socially by her classmates. Careful observation may reveal that she 
is actually being rejected by them. Armed with this information, her 
teacher is in a position to develop a program designed to improve 
Betty's relations with her classmates. 

Mr. Fletcher's class seems to be very irresponsible and unmanage-
able at certain times. “Why just on certain days?” he asks himself.
He then proceeds to observe his class carefully to try to find out what 
conditions or events, if any, consistently create this behavior. He 
discovers that it occurs at times when (a) he is himself emotionally 
disturbed, or (b) an athletic event or contest is approaching. Such 
information helps him to better understand his class and reduce the 
frequency and intensity of such undesirable behavior. 

In each instance cited, the teacher's observation was purposeful, 
prompted by a desire to solve a problem or improve a condition 
which inhibited learning or development. 

2. Constantly remind himself of two things: (a) I must acquire 
facts from my observation, and (b) I must acquire all the facts I 
can. 

It is obvious, of course, that some necessary information about 
students and groups will not (cannot) be learned through observa-
tion. However, the teacher who uses observation for the purpose of
condemning a student or for merely corroborating a preconceived 
attitude or belief concerning him, not only has an unprofessional 
purpose but is forgetting that conclusions and diagnoses based upon 
partial or inaccurate information are false and invalid.

The teacher must continually differentiate what he really sees (or 
hears) from what he thinks he sees or hears; and he must keep his 




eyes, ears, and mind always open to receive additional information 
and be willing to modify or change his conclusions or diagnoses on 
the basis of such new evidence. Miss McKay sees Bill nodding his 
head and fighting (somewhat unsuccessfully) to keep awake in her 
11B English class. Since Bill is not disposed to express great enthusi- 
asm for English literature (a fact which irritates Miss McKay consid- 
erably), she sees Bill’s dozing off as proof positive that he is a “lazy 
lout.” Instead of remembering the fact that Bill “dozed off,” she 
views the incident as further evidence of Bill’s laziness. On all the 
other days, when Bill participates actively in class projects, Miss 
McKay fails to notice his behavior. In this instance, the teacher is 
using observation to search for evidence to confirm her prejudiced 
point of view, and is unwilling to accept any evidence that would 
necessitate the changing of her attitude toward the student. 

In order for observation to be of maximum value as an evaluative
technique, it is, therefore, essential that the observer develop an 
objective, open-minded attitude and be motivated by a sincere desire 
to help the student. 

3. Record observations objectively. There are two principal rea- 
sons for this suggestion. First, observation loses much of its value 
as an appraisal technique if events, behaviors, characteristics, and 
so on, which are observed are not recorded immediately. Most, if 
not all, information gained through observation is not acted upon 
or utilized immediately and must be retained for subsequent use. 
Even though the major aspects of the observation might be remem- 
bered, there are many important details which would be forgotten 
unless they were recorded. Second, when the teacher writes down a 
description of an incident or an observed behavior or characteristic, 
he is much more apt to be objective and factual than when he merely 
carries it around in his memory. People, as a rule, are somewhat 
reluctant to put in writing what they recognize as false, harmful, or 
prejudicial information. Knowing that he will have to record what 
he observes makes the teacher more careful and critical in his 
observation.

4. Make use of checklists and inventories as guides to direct his 
observation and make it complete and comprehensive. This is espe- 
cially important in the early years of teaching when many new duties 
and responsibilities occupy the teacher’s attention. As pointed out
earlier in this chapter, knowing what to look for gives direction to
the observation and makes it less likely to become mere random 
looking. The inventories presented in Chapter 9 give the teacher 
a suggested list of items to look for in his visual and auditory 
appraisal of students. 

5. Study the behavioral and developmental characteristics of boys 
and girls. Familiarity with the characteristics and types of behavior 
to be expected at various levels and stages of the growth and develop- 
ment of adolescents will enable the teacher to accept certain acts 
unemotionally and with less shock than if every behavior manifes- 
tation is viewed as novel, unique, or unusual. 

6. Verify observed data as often as practicable. By application of 
other measures or evaluative techniques, such as tests (achievement 
and diagnostic), examinations (physical, visual, auditory and psy- 
chological), interviews, sociograms, autobiographies, and question- 
naires, he will be enabled to verify or corroborate his observations. 

It is well to remember that no single measure or evaluation can be 
relied upon to be completely valid and reliable—each one, however, 
adds to, supplements, corroborates (or negates) what the others show. 
Thus, the pieces of the jigsaw puzzle which we term “the indi- 
vidual” are filled in and the total picture becomes clear only when 
we make use of several types of measures and/or evaluations, each 
making its own unique contribution to the whole. 


ANECDOTAL RECORDS 


The usefulness of observation as a technique of evaluation is en- 
hanced if the facts of the observation are recorded. The anecdotal 
record serves this purpose. It may be defined as a factual record of 
an observation of a single, specific, significant incident in the behav-
ior of a pupil. Thus, it is evident that the anecdotal record has
certain characteristics:


1. It is factual, recording only the actual event, incident, or observa- 
tion, uncolored by the feelings, interpretations, or biases of the 
observer. 

2. It is a record of only one incident. 

3. It is a record of an incident which is considered important and 
significant in the growth and/or development of the pupil. 




The purpose of the anecdotal record is to present a picture of 
change, growth, or development of the pupil as represented or por-
trayed by his observable behavior over a period of time. It is as
though a series of snapshots were taken in sequence covering a 
period of growth. A graphic display of change would thus be 
presented. 

The question of what behavior to record and what not to record 
is difficult to answer categorically. Basically, the answer hinges upon 
the reason, or reasons, why the observations are made. The behavior
which is to be recorded must contribute significantly to the solution 
of the problem which is being investigated. It would be very un-
economical of time, energy, and space to record behavior at random.
However, if there is any doubt about whether an incident is signifi- 
cant or not, the wise thing to do is to record it first and then discard 
it later if events show it to have little or no relation to the problem 
at hand. It must be borne in mind that writing anecdotes takes time, 
that filing them takes space, and that both operations require energy. 
It behooves everyone concerned, therefore—teacher as well as admin-
istrator—to keep his enthusiasm in check to avoid the danger of
flooding the files with so many meaningless anecdotes that the really 
significant events are lost or inundated in the deluge. 

It must be remembered that a series of anecdotes is recorded for 
the express purpose of delineating the course of growth and develop- 
ment of a particular student, usually with respect to some aspect or 
phase of his behavior, such as his relations with his peers, his study 
habits, his emotional stability, his habits of courtesy, his degree of 
timidity or aggressiveness, and many more. It is important, there- 
fore, that the incidents that are recorded be uncolored by any a 
priori opinions on the part of the observer, and that anecdotes de-
scribe incidents of desirable as well as undesirable behavior. It is
extremely important that the observer not select his anecdotes, either 
positive or negative, for the purpose of establishing the validity of 
his beliefs or conclusions already formed. If used in this way, the 
anecdotal record is worse than useless. As is true of any tool or
instrument, it is no better than the craftsman who uses it.

In brief, the observer records facts; he does not record opinions or
interpretations. At least he does not record the latter as though they 
were facts. It is perfectly proper and even desirable to include as a 




part of the record an interpretation of the behavior observed, or 
even recommendations concerning it, if such interpretations and 
recommendations are recorded separately from the incident itself. 
There are several different forms or styles of anecdotal records. 
Basically, however, they contain the following items or parts: 


Identity of the pupil observed—name, school, class
Date of observation
Name of observer
Setting or background of the incident
Incident
Signature of observer
Optional:
    Interpretation of behavior
    Recommendations concerning behavior


The anecdote may be recorded on a card or paper of any con-
venient size (usually 5" x 7"). Only one anecdote is recorded on each
card. If several anecdotes are to be recorded, a standard 8½" x 11"
sheet is used. Several different types of record forms are shown in
figures 11, 12, and 13.
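
For the reader who keeps such records in a computer file rather than on cards, the parts of the anecdotal record listed above may be sketched as a simple data structure. What follows is only an illustration under stated assumptions, not a prescribed form: the class name, the field names, the completeness check, and the sample entry are all invented here. The observer's signature has no direct counterpart and is represented only by the observer's name.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class AnecdotalRecord:
    """One anecdote: a factual record of a single observed incident."""
    pupil_name: str
    school: str
    grade: str
    observed_on: date
    observer: str
    setting: str    # background that gives the incident its context
    incident: str   # what was actually seen or heard, without opinion
    # Optional parts, kept apart from the factual incident itself.
    interpretation: Optional[str] = None
    recommendations: Optional[str] = None

    def is_complete(self) -> bool:
        """Return True when every required text part has been filled in."""
        required = (self.pupil_name, self.school, self.grade,
                    self.observer, self.setting, self.incident)
        return all(part.strip() for part in required)


# A hypothetical entry; the names, school, and date are illustrative only.
record = AnecdotalRecord(
    pupil_name="Pupil A",
    school="Central High School",
    grade="11",
    observed_on=date(1957, 3, 1),
    observer="Teacher B",
    setting="Study period in the library, about twenty pupils present.",
    incident="Pupil A searched page by page for a topic without "
             "turning to the index at the back of the book.",
)
print(record.is_complete())
```

Keeping the interpretation and recommendations in fields of their own mirrors the requirement that opinion be recorded separately from the incident, and a record with an empty setting is reported as incomplete.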

The setting, or background, of the incident is a very necessary 
part of the record. It sets the stage for the incident and gives it 
meaning and perspective. Even a highly objective record of an inci- 
dent might be very misleading if reported out of context, as illus- 
trated by the following example: 


Incident: While Frank was reading his lines, George continuously 
made faces at the other cast members, whispered loudly to 
students offstage, and created such a disturbance that Frank 
was forced to stop reading. 


As it stands, this anecdote is rather difficult to understand or inter- 
pret, yet it is a very objective description of the situation. What 
makes it rather meaningless is that it is not complete—the setting is 
missing. The reader does not know what the background of the 
incident is or the nature of the total situation of which this incident 
is a part. Judged on the basis of this incident the teacher reading the 
anecdote would probably picture George as a troublemaker, a bad 
influence and, in general, a plain nuisance. 


ANECDOTAL RECORD FORM

Name of Student ______________
School ______________ Grade ______
Date ______________ Observer ______________

Setting

Incident

Observer’s Signature ______________

(front)


Interpretation

Recommendations

(back)


Fig. 11. Sample anecdotal record form


Student's Name ______________ Grade ______
School ______________
Observer ______________

Setting                                Date ______
Incident
Observer's Signature ______________

Setting                                Date ______
Incident
Observer's Signature ______________

Setting                                Date ______
Incident
Observer's Signature ______________

Setting                                Date ______
Incident
Observer's Signature ______________


Fig. 12. Multiple anecdotal record form


202 EVALUATING STUDENT PROGRESS 


Student's Name ______________ Grade ______
School ______________
Observer ______________

Date | Setting and Incident | Interpretations and Recommendations


Fig. 13. Multiple anecdotal record form


Now let us include with this incident the setting, providing some 
orientation and background to the action described in the incident. 


Setting: George and Frank are bitter rivals in all school activities.
Both are very good students. Today tryouts were being held
after school for the junior class play. George and Frank were
both on stage prepared to read their lines for the lead part
in the play, a part both boys wanted very much. I told them
I would select the lead on the basis of their reading of the
lines.




With this additional information and background, George’s be- 
havior is much more understandable. We now have a far different 
impression of him than we had before. With additional anecdotes 
and other facts about George’s background, we would now be able 
to make an intelligent interpretation of this incident and probably 
also make one or more positive recommendations concerning proper 
procedures to guide his future development. 

The anecdotal record is useful to the teacher, the counselor, the 
administrator, in fact, to anyone concerned with evaluating and 
guiding the development of boys and girls. In order to attain its 
maximum usefulness, however, it must fulfill two basic requirements 
or criteria: 


1. It must show clearly what growth and change have taken place. 
2. It must require a minimum expenditure of time and energy. 


What do these two requirements imply for the anecdotal record? 
Fundamentally this—that the accumulated anecdotes be summarized 
periodically and that this summary replace the individual anecdotes 
in the pupil’s file or cumulative record folder. In such a summary the 
salient features of the pupil’s growth and his present status are 
recorded. Thus, it is fairly simple for anyone—teacher or other 
interested persons—to obtain in a moment the developmental picture 
that would otherwise require much time and effort. 


Limitations and cautions in preparing and using anecdotal records 


No tool or technique is perfect. The anecdotal record is no excep- 
tion. Several disadvantages and limitations have already been men- 
tioned. A more thorough discussion of all the shortcomings of the 
anecdotal technique is presented here. 


1. The record may be based upon inaccurate observation. The tend- 
ency for people to see or hear what they want to see or hear, or 
what they are prepared to see or hear, is well known, especially 
in courts of law, where two parties observing the same incident 
will report entirely different versions of the affair. 

2. The incident may not be recorded objectively. Many individuals 
find it extremely difficult to write impartially and to present only 
cold, hard facts. They embellish their remarks with highly 
opinionated and emotion-packed words and phrases, such as
“without a doubt,” “good for nothing,” “highly questionable motive,”
“mean,” “lazy,” “impertinent,” “naughty,” and many
others that are far from objective and are designed to influence 
the reader rather than to present him with information. 

3. The observer may be prompted by a desire to justify an action, 
an opinion, or a conclusion. He may use the anecdote as a means 
of rationalizing his own behavior. 

4. The incident may lose its meaning or significance when reported 
out of context or in isolation from the day by day behavior of the 
student. 

5. Inadequate sampling of behavior may lead one to unfair and in- 
valid conclusions about a student. A summary of too few anec- 
dotes presents the danger of being accepted by the unsuspecting 
or uninformed teacher as valid evidence of the student’s behavior 
or personality. 

6. There is the possibility that in summarizing anecdotes, incorrect 
conclusions, and/or generalizations may be made. Only well- 
qualified, trained personnel should be given the responsibility of 
summarizing. 

7. Since writing, filing, and summarizing anecdotal records does in- 
crease the load on teachers, counselors, and clerical staff, there is 
a danger that the whole anecdotal record program will fail unless 
a definite, workable plan is set up in advance to handle such 
details. 

8. There is a tendency for teachers to record too many anecdotes of 
undesirable behavior and too few of desirable behavior, thus pre- 
senting a one-sided picture of the student’s development. This 
limitation can usually be avoided or minimized by in-service 
teacher education programs designed to develop thorough understanding
of the purpose and use of anecdotal records.

9. Sometimes teachers, in an effort to be fair and not record too 
many negative incidents, will overlook or fail to record incidents 
that are significant in order to present a “balanced” picture of the 
student. 

10. Anecdotes, individually or in summary form, present only a picture
of status; they do not reveal causes.


Advantages and values of anecdotal records 


In the previous section some of the limitations and cautions to be
observed in using anecdotal records were described. Traxler reported
an excellent summary of the various uses and values of anecdotal
records as gathered from published articles on the subject by a
number of writers:


1. Anecdotal records provide a variety of descriptions concerning
the unconstrained behavior of pupils in diverse situations and
thus contribute to an understanding of the core or basic personality
pattern of each individual and of the changes in pattern.

2. They substitute specific and exact descriptions of personality for
vague generalizations.

3. They direct the attention of teachers away from subject matter
and class groups and toward individual pupils.

4. They stimulate teachers to use records and to contribute to them.

5. They relieve individual teachers of the responsibility of making
trait ratings, and provide a basis for composite ratings. Moreover,
they provide a continuous record, whereas trait ratings are usually
made only at certain points in a pupil's school experience.

6. They encourage teacher interest in, and understanding of, the
larger school problems that are indicated by an accumulation of
anecdotes.

7. They provide the information which the counselor needs to control
the conferences with individual pupils. An appropriate starting
point for each conference can be found in the data, and the
discussion can be kept close to the pupil's needs.

8. They provide data for pupils to use in self-appraisal. Whereas in
some cases the anecdotes should not be shown to the pupils, each
pupil can profitably study the indications in many of the anecdotes
about him in order to decide what he needs to do to improve.

9. Personal relationships between the pupil and the counselor are
improved by these records, for they show the pupil that the counselor
is acquainted with his problems.

10. Anecdotal records aid in the formulation of individual help programs
and encourage active pupil participation in remedial work.

11. They show needs for the formation of better work and study
habits and also provide encouraging evidence of growth in these
respects.

12. Curriculum construction, modification, and emphasis may be improved
through reference to the whole volume of anecdotal record
material collected by a school. The anecdotes indicate where there
should be general presentation of material in character development
to satisfy the needs of the whole school community.

13. An appropriate summary of anecdotes is valuable for forwarding
with a pupil when he is promoted to another school.

14. Anecdotal records may be used by new members of the staff in
acquainting themselves with the student body.

15. The qualitative statements contained in these records supplement
and assist in the interpretation of quantitative data.*


To this list the authors should like to add three additional, and in 
their opinion, exceedingly important values: the anecdotal method 
contributes greatly to the teacher’s understanding of human behavior 
and to his ability to see behavior in its total context; it helps him 
to acquire skill in identifying causes; and it vitalizes teaching by 
making him consciously aware that he is dealing with dynamic 
human beings rather than with inanimate books and subject matter. 


INTERVIEWS 


Next to observation, the interview is probably the most adaptable 
and useful informal evaluative device at the disposal of the teacher. 
It is unfortunate that, on the whole, it is rarely thought of in this 
connection. Many teachers seem to regard interviewing as something 
reserved for psychoanalysts, psychiatrists, employment officials, or 
professional counselors of all kinds. Yet the teacher uses the inter- 
view daily in relations with students, parents, faculty members, 
administrators, and just general acquaintances. In each case something
new is generally learned about the person with whom the
discussion was carried on, or about some other individual who might 
have been the subject of the discussion—his attitudes, his interests, 
his feelings, his hopes or ambitions, his likes and/or dislikes, his 
problems, and his accomplishments. In each case, also, the teacher 
appraised or gave some value, great or small, to what the person said 
or the general impression created by his behavior and mannerisms. 

There are many different types of interviews, each with its own 
unique emphasis or purpose, but basically all types have certain 
common characteristics. In every case, the interview is an exchange 
of ideas between two persons in a face to face relationship carried
on for a purpose and constructed, or guided, in some degree, by one 
of the parties involved. 


* Arthur Traxler, “The Nature and Use of Anecdotal Records” (New York:
Educational Records Bureau, 1939), pp. 26-29.




The differences which do exist between the various types of inter- 
views are, therefore, chiefly in the methodology employed by the 
interviewer in achieving the purpose for which the interview is being 
conducted. The employment interviewer follows a set pattern or 
procedure, asking each prospective employee essentially the same 
questions so that he may compare and evaluate their qualifications. 
Little, if any, attention is given to the applicant’s feelings or emo- 
tions. On the other hand, the psychiatric interview is largely unstruc- 
tured and proceeds with a minimum of probing or questioning by the 
psychiatrist (interviewer). The emphasis here is on the client’s 
(interviewee’s) feelings and his emotional status. The purpose in 
this type of interview is to help the client achieve a better emotional 
balance and a clearer understanding of himself and his behavior. 
There are also fact-finding (census survey) interviews, information- 
giving (employment service) interviews, therapeutic interviews, exit- 
interviews (termination of employment or school association), and 
many other variations. 

It should be noted in the two interviews described above, employ- 
ment and psychiatric, that the procedures or techniques employed 
are vastly different. In the case of the employment interview, the 
interviewer is very forthright and direct in his approach. The inter- 
view follows a prescribed and prearranged pattern and the inter- 
viewer is aware from the start of the exact course he wishes to 
pursue during the interview. In the psychiatric interview, on the 
other hand, the interviewer does not attempt to dictate the course of 
the interview but attempts to motivate the client to explore the 
personal and environmental circumstances surrounding his problem. 
The first interview might be described as directive, the second as 
nondirective, or permissive. 

There is yet another type of interview, which might be described 
as eclectic, which makes use of directive or nondirective tactics, 
according to the nature of the interviewee, the topic or purpose of 
the interview, and the judgment of the interviewer. The authors feel 
that this is the approach to interviewing that is most practical for 
the teacher or teacher-counselor. Many interviews in the school sit- 
uation are carried on for the purpose of giving or receiving informa- 
tion of a factual nature where no emotional factors are involved. 
Such an interview can be quite directive. On the other hand, in situations
where high feeling or emotion is involved, the teacher must be
careful not to interject himself or his ideas too forcefully into the 
interview, because to do so might jeopardize his entire relationship 
with the student and make future counseling almost impossible. 
In such cases the nondirective approach is preferable. Whether the 
interview is of the directive or nondirective type, however, it can 
provide the teacher or counselor with a great deal of evaluative data 
impossible to obtain any other way. 

The interview is a time-consuming technique and should be used 
only to gather data that cannot be obtained in any other way or by 
any more efficient means. It must be re-emphasized, however, that 
there are many cases where the interview must be used, since no 
other known technique or device will enable the teacher or counselor 
to obtain the information he needs or desires. How else, for example, 
could the teacher obtain an appraisal of the student's feelings ex- 
pressed in his facial and bodily movements, his mannerisms, his 
vocal inflections, and/or his enunciation? How else can the teacher 
directly appraise the changes in the student's attitudes and beliefs? 
How else can he fully appreciate and understand the dynamic nature 
of the individual? 

The interview, as employed by the teacher, can be said to contribute
to the over-all evaluation program of the school in four important
ways:


1. It supplements and adds to the factual data known about the
student.

2. It provides a check on the accuracy and validity of accumulated
data.


3. It allows the teacher to observe the student in a controlled situ- 
ation for physical evidences of nervousness, emotional abnormal- 
ities, physical disabilities, and poor health. 

4. It provides verbal clues of the presence of conflicts, frustrations, 
prejudices, hates, and fears. 


The value of the interview, as an evaluative technique, is in direct 
proportion to (1) the confidence which the student has in the teach- 
er’s integrity and purpose; (2) the rapport which the teacher is able 
to establish with the student; and (3) the value which the student 
himself sees in the interview. Unless the student has enough confidence
in the teacher to speak freely during the interview, there is
little, if any, value which accrues from it. 

Learning to be a good interviewer cannot be accomplished in “one 
easy lesson.” The following suggestions probably won’t make the 
beginning teacher or interviewer an expert overnight, but they might 
help him to get over some of the rough spots. 


1. Make the student feel at ease by greeting him cordially and
warmly; provide him with a comfortable chair in a pleasant
room; talk about a topic of interest to him to “break the ice” and
start the conversation. A review of the student’s cumulative
record in advance of the interview may provide suitable subjects
to talk about.

2. Don’t monopolize the conversation. Once the student has started
to talk, let him take the lead in bringing new topics into the
discussion.

3. Be a good listener; don’t be surprised or irritated by anything
the student says; show real interest in everything he says; let him
know by your attitude and your remarks that you are deeply concerned
about him and his problem. Genuineness and sincerity are
two traits that will make up for many other shortcomings the
interviewer may have.

4. Don’t use the interview as an opportunity to lecture, preach,
moralize, or indoctrinate the student with personal pet beliefs
and opinions. Certainly don’t be sarcastic or belittle him.

5. Don’t betray a student’s confidence by discussing the interview
with other teachers or acquaintances (not even his parents)
unless you have his express consent to do so.

6. Don’t ask leading questions which force the student to give answers
he really doesn’t believe and don’t put words into his mouth
by suggesting answers to questions.

7. Don’t be too gullible; don’t “swallow” everything the student
says, at least not until you have made sure that he is being honest
with you. This may take several interviews to ascertain. Many
times students deliberately try to mislead the teacher or counselor
by giving him inaccurate or incomplete information to “try him
out” or test him before they get around to divulging their real
problem. Make every effort to discriminate between fact and
fiction.

8. Don’t become so interested in the student’s story that you are
carried away emotionally; remain in charge of the interview,
keep it moving, but don’t dominate it. This is one of the most
difficult techniques of interviewing to master.

9. Don’t think of yourself as an amateur psychiatrist and try to
read hidden meanings into every word the student says. Some of
his remarks are mere “filler” and others may mean just what they
appear to mean!

10. Make the student feel glad that he came to talk to you by pointing
out (or better still, by having him point out) what has been
accomplished by the interview.

11. Let him know that he is always welcome to come back to see you.
If the interview has been carried on successfully, he will no doubt
want to come back, but a direct suggestion or invitation might
not be amiss to give him added assurance.

12. Record the salient facts gained from the interview immediately
after its conclusion. Do this accurately, objectively, and in as
brief a manner as is compatible with completeness. Don’t take
notes during the interview unless the student gives you complete
assurance that he doesn’t mind.


As with every evaluative technique, the interview has some advantages
and some limitations. Among its important advantages are
these:


1. It is as natural as conversation itself. Communication is its only
handicap, thus making it impractical for use only with very
young children or persons with verbal disabilities or speech
deficiencies.

2. It is adaptable to a variety of purposes, problems, situations, and
conditions. Its flexibility is one of its greatest virtues.

3. It can reveal actual causes underlying behaviors or problem
situations.

4. It tends to reveal the whole or total person as a dynamic,
ever-changing being: his feelings, emotions, attitudes, ideas, beliefs,
condition of health, hopes, and desires.

5. It may be diagnostic, fact-finding, and therapeutic at one and the
same time.

6. It provides developmental data showing when and how the problem
started and the stages or phases through which it passed.

7. It enables the teacher to enlist the aid of the student in solving
his own problem or planning his own future actions.

8. It is relatively easy to employ in its simple form.


Its principal limitations are that— 


1. It is almost completely subjective—data obtained may be based 
upon misconceptions and/or be colored by attitudes, prejudices 
and biases. 

2. It may provide unreliable data—facts may be difficult or impos- 
sible to verify. 

3. It is often difficult to interpret the results of the interview; vari- 
ous facts do not all carry the same weight—some that seem rela- 
tively insignificant may actually be the most important. 

4. It may be misleading by throwing a “smokescreen” over the real 
issues or problems and setting the teacher off on a “wild-goose 
chase” attempting to cope with a problem or situation which is 
actually very minor in character compared to the real difficulty. 

5. The student may not really know himself and, therefore, he may 
make statements which, although honestly spoken, are not true. 

6. It places the student in a somewhat artificial situation; there- 
fore, the behavior which is observed during the interview may 
not be completely characteristic of the individual in other condi- 
tions or situations. 


Observation and the interview are highly subjective in character 
and consequently reflect to a considerable degree the philosophy and 
skill of the teacher. Both techniques can, however, be improved and 
made more objective through study, training, practice, utilization of 
certain aids, and a desire for professional self-enhancement or 
growth. 

The interview method of student evaluation is an extension of 
certain aspects of the observational technique. It brings observation 
into a more formal, structured setting, and enables the teacher to 
search for causes, rather than merely to observe symptoms, thereby 
augmenting the cross-sectional data gained through observation with 
additional longitudinal data. 

Emphasis should be given again at this point to the fact that a 
comprehensive evaluation of a student cannot usually be made 
through the application of subjective methods alone. Therefore, it is 
necessary in most cases to supplement observations and interviews 
with other techniques and measurements, including both tests and 
self-reporting devices. Since the data obtained from subjective
methods are not in the same units (and therefore not directly comparable)
as those obtained from the more precise and quantitative
objective measures, there is a considerable problem involved in
synthesizing, analyzing, and interpreting the data obtained through 
the use of the different methods. The discussion of the case study in 
Chapter 12 will attempt to clarify this situation and present some 
suggestions for integrating and coordinating data as a basis for an 
over-all evaluation of the growth and development of the student. 


CHAPTER 
I 


Using Sociometrics, Sociodrama, 
Autobiography, and Other Informal 
Techniques 


In the Thirty-Fourth Yearbook of the National Society for the
Study of Education, Brueckner stated, “The function of the school is
to provide a carefully guided series of learning activities that will
insure the achievement of those objectives of education that are
accepted as valid and worth-while. These objectives should not be
thought of narrowly in terms of specific knowledge, skills and abilities,
but broadly so as to include attitudes, appreciations, power,
purposes, and controls. . . .”1

To these objectives, broadly stated by Brueckner, have now been
added other desired outcomes, especially that of developing the ability
to live in harmony and understanding with one another, generally
referred to as human relations. This involves “social” learning:
social relationships and patterns of behavior and belonging, feelings
and concerns about relationships with families and peers. The elements
and qualities involved in human relationships are so intangible
and nondefined for the most part that no objective instruments have
yet been devised to measure them quantitatively with any degree of
validity. However, two very promising informal methods have been
devised to obtain evidence of the students’ development of the qualities
and behaviors essential in satisfactory human relationships,
namely, sociometry and the sociodrama.


1 Educational Diagnosis, Thirty-Fourth Yearbook of the National Society for the
Study of Education (Bloomington, Illinois: Public School Publishing Co., 1935),


SOCIOMETRY 


Sociometry is essentially a method for determining social group 
structure or friendship patterns within a class or group of students. 
As originally devised by Moreno and subsequently applied to the 
problems of evaluation in the classroom by Jennings, Cunningham, 
Taba, and others, sociometry enables the teacher to chart the 
dynamic relationships expressed by members of a group at any 
given time.2 This is accomplished by asking the students to respond
to certain questions regarding their preference of associates for 
certain real or concrete situations or activities. Although the ques- 
tions asked by the teacher must “fit” the particular situation, they 
usually take such forms as— 


“Whom would you most like to have sit next to you in homeroom?” 

“Whom would you most like to have as a member of a committee 
with you to plan the next class party?” 

“Whom would you most like to have join you on a visit to the city 
council for an interview with the mayor?” 

“Whom would you most like to invite to your home for a week-end
visit?”

“Whom would you most like to have as a laboratory partner in 
biology?” 

“Whom would you most like to have go to the library with you to find 
material for a science report?” 

“Which three people would you most like to have as members of a 
nominating committee for class officers?” 


After the responses are obtained for any one of these questions, the 
choices are diagramed. This diagram of student preferences is called 
a sociogram and is illustrated in Figure 15 (p. 220). 

2 Helen Hall Jennings, “Sociometric Grouping in Relation to Child Development,”
Fostering Mental Health in Our Schools, Association for Supervision and
Curriculum Development (Washington, D.C.: National Education Association),
p. 205.


It is very important that certain rules be observed in the wording
and administration of the sociometric test (it is not really a test in
the usual sense, inasmuch as there are no right or wrong answers)
if it is to be of maximum benefit to the teacher and the student:


1. The situation should be real for the choosing; choices are not 
hypothetical; they are made for an actual situation in the same 
terms as the action is going to be. 

2. The test is not an end in itself; its results are always put into 
effect to change the arrangements for working or living in accord- 
ance with the choices; sociometric arrangement is only setting the 
stage for a better group work situation. 

3. There is an immediacy to the choosing; it is for right now, tomorrow
or next week, not some vague time in the future or two
months later.3


In order to achieve the most valid results, the sociometric ques- 
tion must be worded so that it motivates the student to give an 
honest answer by indicating precisely what the choices are for, why 
he is being asked to make the choices, when the choices will be acted 
upon (put into effect), and how long the groupings arrived at from 
the choices will be allowed to remain in effect before a regrouping 
is made on the basis of new choices. The following examples will 
serve to illustrate how a sociometric question should be put to a 
group. The first question deals with the seating of students in a 
homeroom: 


You are seated now as you happened to get seated in our homeroom, 
but now that we all know one another, every pupil should have the op- 
portunity to sit near the other pupils he most wants to sit beside. Then 
the classroom can be arranged to suit everyone. Write your own name 
and under it three choices of pupils you would like to sit near in this 
room. Put a “1” next to your first choice, a “2” for your second, and a 
“3” for your third choice. I will try to fit in as many of everyone’s 
choices as possible. But since there are many pupils and each of you 
may be choosing in many different ways, you can see how it is that I 
can only do my best to arrange the seats so everyone gets at least one 
choice, and more only if I can figure the seats out that way.4


3 Ibid., p. 205.
4 Ibid., p. 204.


The second question is involved with the problem of grouping for
committee work:


Each of you knows best whom you would enjoy being with in the
same grouping for committee work in social studies for the times we will
be working together. No one can know this as well as yourself. We shall
be arranging our new schedule for groups next Monday. Today is Friday,
and I can figure out the membership committees by Monday if you
would like to choose associates today. We will stay with the same people
we choose today for eight weeks, and then we will have a chance to
choose again. Keep in mind all the boys and girls you have come to
know, whether they are here today or not. Let’s give three choices, or
four if you like. Wherever possible I'll arrange the groups so that the
individual gets all his choices. But it is very difficult to give all people
all their choices because lots of people might choose one person. All of
them are just as important as this one person.5


It should be noted from the examples given that only positive 
choices or reactions are called for. Although there may be times 
when it is desirable to ask students to express their dislikes for 
certain persons in the class or group, it is generally desirable not to 
emphasize negative feelings. This practice (asking only for positive 
choices) will tend to eliminate or reduce the objection of many 
teachers to the use of the sociometric technique on the grounds that 
it tends to solidify or magnify cleavages within the group. 

If rejections are wanted, however, the question must be put tact- 
fully and in keeping with the situation, with no implication of one 
student judging another. Such a question might take the following 
form: 


Each of you also knows if there are any people with whom you feel 
particularly uncomfortable in the situation we are choosing for, or who 
may feel this way about you, where a feeling of uneasiness or annoyance 
between them and you may come up in the situation. So I can arrange 
our grouping to avoid this, if there are any people about whom you feel 
this way, or any people who you think feel this way about you, put their 
names at the bottom of the paper. If there aren't, leave it blank.6


5 Ibid., p. 205.
6 Ibid., p. 206.


The results of sociometric studies show that students tend to fall
into a number of categories based upon their status and relationship
within the class group. In any sociogram there may be:


1. Isolates: students not chosen by anyone in the group.

2. Pairs (or mutual choices): pairs of students who choose each
other.

3. Chains: student A chooses student B, student B chooses student
C, C chooses D, and so on.

4. Islands: pairs or small groups not chosen by anyone in other
groups or patterns.

5. Triangles (or circles): a closed chain of three or more students in
which A chooses B, B chooses C, C chooses A, or C chooses D,
who in turn chooses A.

6. Stars (or leaders): a student chosen by a considerable number of
persons in the group.
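Several of the categories listed above can be picked out by a simple tally of the choices. The following is only a minimal sketch in Python and is not part of the procedure as originally described: the function name `classify`, the sample names and choices, and the `star_threshold` cutoff are all illustrative assumptions (the text says only "a considerable number"). It finds isolates, mutual pairs, and stars; chains, islands, and triangles are left to inspection of the sociogram.

```python
from collections import Counter


def classify(choices, star_threshold=3):
    """Tally a sociometric question.

    `choices` maps each student to the list of classmates he or she chose.
    `star_threshold` is an assumed cutoff for "a considerable number".
    """
    # Count how many times each student was chosen by others.
    received = Counter(name for picks in choices.values() for name in picks)

    # Isolates: students who were chosen by no one.
    isolates = sorted(s for s in choices if received[s] == 0)

    # Pairs: two students who chose each other (each pair listed once).
    pairs = sorted(
        (a, b)
        for a in choices
        for b in choices[a]
        if a < b and a in choices.get(b, [])
    )

    # Stars: students chosen at least `star_threshold` times.
    stars = sorted(s for s in choices if received[s] >= star_threshold)

    return {"isolates": isolates, "pairs": pairs, "stars": stars}


# Hypothetical class of five students, two choices each.
choices = {
    "Ann": ["Sue", "Bob"],
    "Sue": ["Ann", "Bob"],
    "Bob": ["Bill", "Sue"],
    "Bill": ["Bob", "Sue"],
    "Jo": ["Bob", "Ann"],
}
result = classify(choices)
print(result)
```

In this made-up class Jo is chosen by no one, while Bob and Sue receive the most choices. The cutoff for a star should be set by the teacher in light of the size of the group and the number of choices allowed.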


Constructing the sociogram 
Suppose the teacher wishes to obtain sociometric data on the basis
of the question: “Which three people in this class would you most
prefer to have work with you on the magazine subscription drive?”
The easiest way to obtain the information is to give each student a
card or slip of paper large enough for his name and the names of the
three choices (a 3" x 5" card is usually adequate). The chooser's name
is written at the lower right and the choices are listed above.

When all the cards are collected, it is a simple matter to tabulate 
' them on a form such as is illustrated in Figure 14. Bill's choices are 
Checked off by placing an X (or a dot) in the appropriate square on 
the horizontal line following his name. After all choices have been
tabulated, the totals may be entered on the bottom line to indicate
how often each student was chosen. 
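
The tabulation just described can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not a reproduction of Figure 14: the names, the choices, and the functions `tabulate` and `render` are hypothetical. Each row belongs to a chooser, each column to a student who could be chosen, and the bottom line carries the totals.

```python
def tabulate(choices):
    """Return (names, table, totals) for a dict of chooser -> list of chosen."""
    names = sorted(choices)
    # One row per chooser, one column per student who could be chosen.
    table = {chooser: {chosen: False for chosen in names} for chooser in names}
    for chooser, picks in choices.items():
        for chosen in picks:
            table[chooser][chosen] = True
    # Bottom line: how often each student was chosen.
    totals = {n: sum(table[chooser][n] for chooser in names) for n in names}
    return names, table, totals


def render(names, table, totals):
    """Format the tabulation as plain text, with an X for each choice."""
    width = max(len(n) for n in names + ["Total"]) + 2
    lines = ["".ljust(width) + "".join(n.ljust(width) for n in names)]
    for chooser in names:
        marks = "".join(
            ("X" if table[chooser][n] else ".").ljust(width) for n in names
        )
        lines.append(chooser.ljust(width) + marks)
    lines.append(
        "Total".ljust(width) + "".join(str(totals[n]).ljust(width) for n in names)
    )
    return "\n".join(lines)


# Hypothetical cards: each student names three classmates.
choices = {
    "Bill": ["Ann", "Sue", "Bob"],
    "Ann": ["Sue", "Bob", "Bill"],
    "Sue": ["Ann", "Bob", "Bill"],
    "Bob": ["Bill", "Sue", "Ann"],
    "Tom": ["Bill", "Bob", "Sue"],
}
names, table, totals = tabulate(choices)
print(render(names, table, totals))
```

A useful check on the hand tabulation is that the totals on the bottom line must add up to the number of students multiplied by the number of choices each was asked to make.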

The final step in organizing the sociometric data is the actual 
drawing of the sociogram. Any size sheet of paper may be used for 
this purpose, depending upon the number of students to be charted, 
storage facilities, and other practical considerations. An 8½" x 11"
sheet of rather heavy paper is a convenient size and type for most
groups. Students are indicated on the sociogram by using symbols,
such as circles, triangles, or squares, one symbol for boys, another
for girls. It simplifies the construction and reading of the sociogram
if the “stars” or “leaders” of the group are located near the center
of the chart, the “isolates” on the outer edge, and other students in
between.


Fig. 14. Tabulation form for sociometric data

There are several different methods by which the choice lines may 
be indicated on the sociogram. These are illustrated below. 


Ann ----> Sue     One-way choice: Ann chooses Sue. Sue does
                  not choose Ann.

Bob <---> Bill    Bob and Bill choose each other.

Jim ----- Jo      Jim chooses Jo as first choice. Jo chooses Jim
                  as second choice.

Jen ----- Terry   Jen chooses Terry as first choice. Terry
                  chooses Jen as third choice.


It is also possible to use lines of different colors to indicate the 
various choices, for example, red for first choice, blue for second, 
green for third, and so on. Rejects may also be indicated by colored 
lines.


Reading the sociogram 

Presented in Figure 15 is a sociogram of a junior high class based 
on the responses to the question, “If you could choose anyone in your 
home room to sit next to you, whom would you choose?” 

In the sociogram shown in Figure 15, student A is a “star,” or 
“leader,” i.e., the most selected individual in the group. Student B is 
an “isolate,” not chosen by anyone. Students M, N, O, P in the 
upper left of the page form an "island," "clique," or “cleavage.” A 
"chain" is illustrated by students G, H, I, J, K, L, A, who are linked
together by their choices. Students G, H, and I form an isolate 
"clique," since they are not chosen by anyone in the group. 

The teacher confronted with the situation portrayed by this socio- 
gram would face the necessity of breaking up the clique and drawing 
the isolate and the isolate clique into the group. Thus, the sociogram 
portrays a situation; it does not solve any problems or show any
causes. 




Fig. 15. Sociogram of a junior high group 


Other uses of the sociogram 


In investigating the interpersonal relationships existing among 
members of a class, the teacher frequently wishes to determine how 
certain specific factors affect the group structure and the preferences 
of members for each other. A number of such factors have been
suggested by the Horace Mann-Lincoln Institute of School
Experimentation of Teachers College, Columbia University, in a pamphlet
entitled “How to Construct a Sociogram”: 


1. Studying race and nationality background in relation to group
structure. Material is plotted in the way described, except that the
additional factor is plotted for each individual. If one were
examining the group relationship of Negroes, Anglo-Whites,
Spanish-Americans and Oriental-Americans, a code such as the following
(to replace the circles for girls and the squares for boys) might be
used for plotting:


[Symbol key: one plotting symbol each for Negro, Anglo-White,
Spanish-American, and Oriental-American.]


2. Studying age and/or maturity in relation to group structure. In 
an investigation of grouping, a promotion policy, age of admit- 
tance to school, and so on, it may be desirable to know the group 
relations of individuals of various ages and maturity. Under each 
plotted circle or square may be noted the age of the individual and 
a general indication of maturity, such as, mature for age (+), 
average (0), or immature (—). 



3. Studying relation of group structure to members in other groups. 
The investigator may wish to know the relation of the total group 
structure to out-of-classroom groupings. Does belonging to a 
scout troop, a sorority, the safety patrol, or similar groups (volun- 
tary, appointive, or elective) tend to be associated with more or 
less acceptance of the individual by the group? Of the membership 
group by the group as a whole? If such membership tends to be 
associated with leadership, is it status leadership or operational 
leadership? In making such a study, the person holding member- 
ship in the other group may be designated by a different color, e.g., 
red circle indicating girl who holds sorority membership, and blue 
circle indicating nonmembership. 

4. Studying influence of certain experiences. A teacher may wish to 
know the influence of various techniques over a period of time. 
For example, he may want to know whether constituting com- 
mittees by lot or on the basis of voluntary membership tends to 
integrate the group more successfully. In this case a “before and 
after” application of the sociometric device is called for. He may 
make a sociogram in November, use the voluntary method for 
three months; make another sociogram in February, use selection 
by lot for three months; and make another sociogram in May. A 
study of the three sociograms may give indications of an answer 
to this question. 7

7 “How to Construct a Sociogram,” Horace Mann-Lincoln Institute of School
Experimentation, Teachers College, Columbia University (New York: Bureau of
Publications, Teachers College, Columbia University, 1949), pp. 10-11.




These represent but a few examples of the many ways in which 
the sociogram may be employed. The teacher should feel free to 
experiment to determine the effect of factors on the structure of the 
group, such as place of residence (urban, rural), occupation of father, 
home address (area of city), church affiliation, socio-economic status, 
and so on. 
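
One simple way to examine such a factor is to count how many choices stay within a category and how many cross from one category to another. The sketch below is illustrative only and rests on stated assumptions: the names, the choices, the urban and rural labels, and the function `choices_by_factor` are hypothetical. Like the sociogram itself, the count describes a situation and shows nothing about causes.

```python
def choices_by_factor(choices, factor):
    """Count choices made within the same category and across categories.

    `choices` maps each chooser to the list of students chosen.
    `factor` maps each student to a category label (for example, residence).
    """
    within = across = 0
    for chooser, picks in choices.items():
        for chosen in picks:
            if factor[chooser] == factor[chosen]:
                within += 1
            else:
                across += 1
    return within, across


# Hypothetical data: two choices per student, labeled by place of residence.
choices = {
    "Ann": ["Sue", "Bob"],
    "Sue": ["Ann", "Jo"],
    "Bob": ["Bill", "Ann"],
    "Bill": ["Bob", "Jo"],
    "Jo": ["Bill", "Bob"],
}
residence = {
    "Ann": "urban",
    "Sue": "urban",
    "Bob": "rural",
    "Bill": "rural",
    "Jo": "rural",
}
within, across = choices_by_factor(choices, residence)
print(f"within-category choices: {within}, cross-category choices: {across}")
```

In a small class the proportions shift with a single changed choice, so a count of this kind is best treated as a lead for further study.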


Some general considerations concerning the use of the 
sociometric method 


1. Choices are valid only for the specific situation described in the 
question. It is dangerous to generalize and assume that students’ 
choices will be the same in all situations and for all purposes. 

2. Although person-to-person relationships are dynamic and never 
completely static, they do not change radically or rapidly among 
older children. It is important, therefore, that sociometric choices not be
requested for the same purpose so frequently as to be regarded as
ridiculous by the group members. Rather, the sociometric test should 
meet the felt needs of the members and be employed only in situa- 
tions where and when choices appear logical. 

3. The whole sociometric procedure must be kept as casual and 
natural as possible in order to maintain high group morale and good 
teacher-student rapport, and eliminate resistance to the questions. 
If this is not done, the validity of the responses is likely to be so low
as to make the sociogram completely worthless. 

4. Sociometry is only a method of locating conditions in the class- 
room which need further study, both with regard to the total group 
and subgroups, including the individuals who constitute such
groups. The sociogram, in other words, does not give final answers; it
merely opens the way for further study by other devices and
techniques.

5. Do not use the sociometric question until students have had 
adequate time to become thoroughly acquainted with each other. 
This period varies with the size of the group, the number of minutes 
or hours the members spend together each day, the nature of their 
activities during this period, and so forth. 

6. Sociometry is a relatively new and largely untried method of 
student study. There is little valid or conclusive evidence that it can
or does provide information or facts not already available by other
methods, such as observation, anecdotal materials, personality tests,
rating scales, and the like. It is important, therefore, that the appli- 
cation of sociometry to the problems of human relationship be 
approached in a highly professional manner and that an accurate and 
unbiased appraisal be made of the results. Only then can the teacher 
use his increased knowledge to provide better learning and living 
experiences for the boys and girls in his care. 


SOCIODRAMA (ROLE-PLAYING) 


Playing roles is a natural phase of the daily lives of every child, 
adolescent, and adult. The four-year-old, during the course of the 
day, plays the role of the “good little kid” when he does what his
mother tells him to do because he knows that then he will be
rewarded with some candy; he plays the role of the “leader” when he
directs the activity of his playmates; and the role of the “follower”
when he “goes along” with the suggestion of his chums that it would
be lots of fun to pick all the flowers in the neighbor's yard. 

The adolescent plays the role of “student” when he applies himself 
diligently to the preparation of a research or term paper which has 
been assigned for his American history class; he assumes the role of
the sophisticated “man about town” when he escorts his girlfriend 
to the junior prom; he takes on the role of the “hero” when he scores 
the winning field goal in the final seconds of the basketball game; 
and of the great “dramatic actor” when he recites his lines in the 
school play. 

In like manner, the adult plays many roles in normal life. The 
father plays one role at the office, another when he is driving his 
car, another when he is at home with his family, and still another 
when he is at his club or business association. The mother, too, 
assumes many roles—that of mother, clubwoman, housewife, shopper, 
and P.T.A. worker. 

In each role, the individual's behavior is influenced by his heredity 
(physical and mental), experience and learning (attitudes, skills, 
knowledge), the immediate situation (including relationships with 
other persons), and the anticipated, expected, or hoped-for conse- 
quences or outcomes of the behavior (based upon the individual's 
needs, goals, and ambitions). The sum and substance of all these 
factors is that each individual learns to play one role in one
situation, another role in another situation. There also are many roles
which each individual would like to play in life, but which for vari- 
ous reasons he never actually accomplishes. Each person, in addition, 
develops attitudes and feelings (some highly emotional) about the 
manner in which others play their roles in life. It is this complex 
combination of actual, personal roles, wished-for roles, and feelings 
about the roles of others that in certain cases creates conflicts and 
attitudes which interfere with personal adjustment and development 
as well as with the attainment of satisfactory interpersonal, human 
relationships. 

Sociodrama, like sociometrics, is a technique to help the teacher 
and the students acquire a better understanding of human rela- 
tionships and personal problems through role-playing. It is “an 
intensive, vivid, living through of experiences of common concern to
the group members—experiences which may have been cut short in
life and blocked from full expression, leaving unresolved, buried
emotional impact.” 8 

Although sociodrama is useful for evaluation, diagnosis, education, 
and therapy, its inclusion in this book is based chiefly upon its value 
as a diagnostic and evaluative tool. Through its use both the teacher 
and the student benefit. It enables the teacher to— 


1. Obtain clues to causes of problem behavior. 

2. Secure information concerning home and family relationships and
conditions.

3. Find evidences of previously unexpressed values, goals, and
needs.

4. Acquire information concerning students’ attitudes, beliefs, fears,
hates, likes, dislikes, and prejudices, as well as their origins.

5. Check on information and validate hypotheses and/or conclusions 
arrived at through other techniques. 


It enables the student to— 


1. Learn to appreciate the “other fellow’s” position—his problems
and his conflicts. 


2. Develop greater personal insight and a more realistic self-concept. 
8 Helen Hall Jennings, “Sociodrama as Educative Process,” Fostering Mental
Health in Our Schools, 1950 Yearbook, Association for Supervision and Curriculum
Development (Washington, D.C.: National Education Association, 1950), p. 260.




3. Develop a deeper understanding of behavior—its causes and its 
consequences, both with regard to the individual and the group. 

4. See the inadequacies in certain kinds of behavior of others. 

5. Experiment with several different kinds of behavior in a given situ- 
ation and to experience the outcomes in a make-believe situation 
without the serious consequences that might be attached to such 
behavior in real life. 

6. Express feelings and emotions that he could not express in any 
other way, thereby releasing tensions and resolving conflicts. 


The sociodrama must be directed (by the teacher) so that it pro- 
ceeds more or less naturally and spontaneously from the selection of 
a problem situation to the final evaluation of the performance by 
the spectators (and participants). In general, the procedure, although 
not as clear-cut and sharply defined as this discussion might make 
it appear, includes the following steps: 

1. A “warm up,” during which the teacher motivates the group to 
the “venture of identifying their problems.” 9

2. The teacher assists the group members (1) to think of problems 
which are of interest and concern to them, and (2) to select one 
problem for the sociodrama. This problem (the one to be drama- 
tized) should be one which— 


a. Is representative of the problems of the group members. 
b. The majority of the group members want to explore. 
c. The teacher is enthusiastic about and willing to have explored. 


3. Students volunteer for the various “parts” or roles to be played. 
It is important that no student be forced to play a role which he does 
not want to play. 

4. Players are “warmed up,” made emotionally ready to play their 
chosen roles. The amount of warming up will vary with the teacher, 
the amount of previous involvement of the students with the prob- 
lem, their age, maturity, and previous experience with the socio- 
dramatic technique, and the rapport which exists among group 
members and between them and the teacher.

Jennings says that “it is usually helpful in directing, to assist the 
subject to place himself in time and space and establish (begin to 


9 Ibid., p. 271.




feel) the ‘mood’ of the role before he actually enters the sociodrama. 
An example follows: 


TEACHER: Exactly where are you?
STUDENT: Down the street.
TEACHER: Down what street?
STUDENT: Near the Sphinx Paper Factory, I'm just going—
TEACHER: What time of day is it?
STUDENT: It's not day, it's midnight—kind of windy—
TEACHER: What are you wearing?
STUDENT: My mackinaw, pulled up—it's so cold I can scarcely see—
(hunches himself, acts hurried).
TEACHER: Go ahead!” 10


5. The situation is enacted, during which freedom of expression 
and feeling on the part of players and group members are not only 
permitted but encouraged. No attempt should be made by the direc- 
tor (teacher) during this time to teach grammar, to censure behavior, 
or to control feelings or emotions. 

6. The actions and words of the players are discussed, analyzed, 
and criticized by the group members and the players themselves. 
Recommendations are made by all parties (including the teacher) to 
the sociodrama for changes or improvements in the portrayals of the 
roles by the players. Following such recommendations, either the 
same or different students may portray the roles again, again
followed by an evaluation and discussion. This process—portrayal
followed by discussion and evaluation—may be repeated several 
times if the group members (including the teacher) so desire. 

Sociodrama is a technique for helping the teacher achieve a better 
understanding of individuals and groups. It is a projective technique;
that is, the student projects various aspects of his own personality 
into the dramatic situation and complete freedom of expression is 
essential. If properly used, it is an indispensable evaluative tool. 
However, it does have certain limitations, and the teacher using the 
method should observe certain precautions, which, according to 
Bernard, include the following: 


1. The temptation to overinterpret—to see too much in the behavior
—must be avoided.


10 Ibid., p. 273.




2. Permissiveness should not be so great as to endanger children or 
property, but neither should restriction be so great as to inhibit 
the growth of healthy independence. 

3. The classroom should not become a clinic. 


OTHER INFORMAL EVALUATION DEVICES 


Autobiography and compositions 


The autobiography is a useful evaluative tool because it provides 
the teacher with insights into the student's behavior, attitudes, feel- 
ings, emotions, goals, ideals, and values, which could not be obtained 
by questionnaires, interviews, or the more objective test procedures. 
It is an especially appropriate device for use in the high school, 
since during these adolescent years students are prone to become 
introspective and thoughtful about their past, present, and future 
lives. Many times a person will write more than he will say in an 
interview and reveal much more about himself than he realizes. The 
organizing and writing of an autobiography is, therefore, beneficial 
not only to the teacher but to the student as well, since it encourages 
the individual to evaluate himself and his behavior in somewhat 
more objective and comprehensive fashion than he might otherwise 
do. 

In order for the autobiography to be of maximum value to the 
teacher and student, however, certain factors must be considered by 
the teacher:


1. The autobiography should not be assigned until the student has 
become well adjusted and oriented in the school situation, until he 
has been in the school and with the group at least a month. This 
time will, however, vary with different students and with different 
situations. 

2. At least one, and preferably two or three weeks should be allowed 
for the student to complete his autobiography. The job that is 
done hurriedly is very apt to be nothing more than a chronological 
listing of events in the student's life, which is not the purpose of 
the autobiography. 

3. The autobiography should be assigned by only one person in the 
school. The student should not have to repeat his life story a half 


11 Harold W. Bernard, Mental Hygiene for Classroom Teachers (New York:
McGraw-Hill Book Co., 1952), pp. 360-61. 




dozen times for different teachers. The best results are likely
if the assignment is made by the English teacher as a regular
assignment, by the homeroom teacher, by the guidance teacher as 
an assignment in a special guidance class, or by the social studies 
teacher. 

4. No matter who makes the assignment, the writing of a good auto- 
biography should be discussed with the students before the task is 
undertaken. The teacher should read or present excerpts of well- 
written autobiographies to the students. He should, in addition, 
assure the class that all information included in the autobiography 
will be kept in strict confidence and will not be shown to anyone 
else without the student’s knowledge and consent. Adolescents 
(like adults) will write very little of value about themselves if they 
feel the teacher cannot be trusted to keep their thoughts
confidential.

5. The autobiography may be completely unstructured, being introduced
merely by the instruction, “Write the complete story of your life,” 
or it may be more rigidly structured by presenting outlines of 
varying degrees of completeness for the students to follow. 


With no outline to follow, the student is apt to omit certain impor- 
tant phases of his life story; with an outline which is too complete, 
on the other hand, he is so restricted that the autobiography is in 
danger of becoming merely a questionnaire.

An example of an outline which directs the individual’s thinking 
along certain lines, yet leaves him considerable freedom of expres- 
sion, has been worked out by the Los Angeles County Schools: 


1. Early history of the student
2. Family background and history
3. Health and physical record
4. School history
5. Leisure-time activities, hobbies, travel experience, friendships
6. Occupational experiences
7. Educational plans for the future
8. Long-time vocational plans
9. Desires and plans regarding marriage and home life 12


12 Guidance Handbook for Secondary Schools, Los Angeles County Schools (Los
Angeles: California Test Bureau, 1948), p. 48. 




As a variation of the complete-life-history type of autobiography, 
the teacher may shorten it by instructing students to write on such 
topics as— 


What I like best (least) about school 

My most interesting experience 

My most satisfying experience 

My most frightening experience 

My favorite television (movie, radio) personality 

My most embarrassing experience 

My favorite relative (uncle, cousin, nephew) 

When I felt completely inadequate 

My favorite “family” fun 

My biggest “gripes” 

My most interesting job 

The teacher may also vary the autobiographical technique by ask- 
ing students to write on only one phase of their life history, such as 
school history, health history, vocational experiences, and so on. 

Another technique, which is not strictly autobiographical in nature 
but is closely related to it, is the use of the “incomplete sentence,” or 
the “open question,” to obtain information about the student's atti- 
tudes, feelings, ideals, and goals. The teacher may use such stimulus 
or “starter” words as— 


I become frightened . . . 

I feel lost . . . 

If I had my way . . . 

I'd like to . . .

My boy (girl) friend . . . 

My father (mother, brother, sister) . . . 
If I were a parent . . . 


The student completes the sentence and then adds whatever other in- 
formation he desires. The teacher may have him complete one or a 
series of such incomplete sentences. 

In employing the “open question,” the teacher asks the student to 
write freely on a question that reveals a great deal about his ideas 
and feelings. The response made by the student is completely his 
own—unrestricted by outlines or directions imposed by the teacher. 
Since the student sees no “right” or “wrong” answer implied in the 




“open question,” he is likely to give free expression to his feelings. 

In Diagnosing Human Relations Needs are listed numerous open 
questions in several categories—“School,” “Family,” “One's Self,”
“Peer Relations,” “Miscellaneous.” Typical items in each category 
are— 


School
   My ideal student
   What I like (don’t like) about teachers
   How school meets my needs

Family
   Why I like my family
   What roles should parents play toward you?
   With which people older than yourself do you find it easy (difficult)
   to get along? Explain in what way it is easy (difficult) to get along.

One’s self
   Three wishes
   What I like about myself
   Things you don’t like about yourself
   If I were twenty-one, what kind of a person I would like to be
   What makes me mad

Peer relations
   Characteristics that I think make good friends
   What is necessary to get along in a gang, with a gang, or with a group?
   Where and with whom do you play?

Miscellaneous
   What I like about my neighborhood
   What you need in your neighborhood and don’t have
   What does religion mean? 13


Since the autobiography and its various adaptations is a self- 
report document, its reliability and validity depend largely upon the 
conditions under which it is written and the emotional stability of 
the student at the time of writing. The teacher can go far to assure 
reliability and validity of the autobiography if he will— 


1. Develop good rapport with his students, so that they have confi- 
dence in him and respect for him. 


2. Motivate the students to want to write their autobiography. 


13 Hilda Taba, Elizabeth Hall Brady, John T. Robinson, William E. Vickery,
Diagnosing Human Relations Needs (Washington: American Council on Education,
1951), pp. 154-53.




3. Assure them that all information will be held in strict confidence.

4. Offer to discuss any problem issues in the autobiography with the
student (if he requests it) and do whatever he can to assist him.

5. Allow sufficient time for the student to write the autobiography
without hurrying or causing undue conflict with other assignments.

6. Not be unduly critical of the technical aspects of the paper, such
as grammatical construction, spelling, punctuation, and so on.
These items should not constitute a major focus of attention—at
least not if the autobiography is to be used as a personal appraisal
medium.




There is also the ever-present danger that the teacher will project 
himself into the events of the autobiography and see things in them 
that are not really there—things that the writer never intended to be 
there. Teachers must be careful lest they become amateur psychi- 
atrists and overinterpret the autobiographical material or use it to 
substantiate or corroborate their own previously formed opinions of 
the student. 


THE PERSONAL DATA BLANK 


The personal data blank is a special form of the questionnaire de- 
signed to obtain certain types of information about a student. In a 
sense it is an information-getting interview in written form. It would 
be ideal, perhaps, to interview personally every student when he 
enters the school, but the ideal is not always feasible or practical. 
This is the case with the entrance interview, especially where large 
numbers of students are involved. The alternative is the use of the 
personal data blank. Its principal value as an appraisal device is that 
it enables the teacher to obtain a great deal of information from a 
great many students in a relatively short time with no special train- 
ing or equipment required. 

The principal uses of the personal data blank are: (1) to obtain 
factual background information about new students; (2) to keep the 
cumulative record up to date; (3) to obtain certain types of infor- 
mation needed for special purposes (e.g., case studies, counseling in- 
terviews, special awards, job placement, etc.), and (4) to ascertain 
facts about a student that might be embarrassing to him if asked in 
a personal interview (especially if the student has not yet become 
adequately oriented). This blank enables the teacher to become
acquainted with large numbers of new students in shorter time than
would otherwise be possible. 

Each school (or each teacher) should formulate the blank to meet 
its own situational needs. That means, in most cases, that the various 
sections and items of the blank must be keyed to the particular 
cumulative record form in use in the school. If this is done, the in- 
formation may be transferred from the blank to the cumulative 
record with the greatest possible ease and in the shortest possible 
time. It is clear, therefore, that the areas of information on the blank 
will include such items as identifying data about the student, his 
home, and family; health and physical history; school (most and 
least liked subjects, extracurricular plans) ; educational plans; voca- 
tional objectives; interests, hobbies and leisure-time activities; and 
work experiences. 

If the information on the blank is not all transferred to the
cumulative record, the entire blank should be filed in the student’s
cumulative folder for future reference. In the latter case, it is desirable to
provide answer spaces for each year the student is in school. Thus, 
the same blank can be used each year, which makes it possible for 
the teacher or counselor to see changes in the student’s status from 
year to year at a glance. It is strongly recommended that each school 
tailor its personal data sheet to fit its own individual needs. Figure 16 
is an example of portions of a personal data form. 


PERMISSIVE DISCUSSION 


Most adolescents enjoy talking to each other, as witness their in- 
terminable discussions on the telephone. They also are capable of 
good solid thinking about many problems, as witness the many in- 
stances on local, state, and national levels where civic and govern- 
mental officials call upon adolescents to help solve problems in which 
they are directly concerned (e.g., delinquency and highway safety). 
It is helpful to the teacher, in attempting to understand adolescent 
behavior, to enlist students in frequent discussions in a permissive 
atmosphere. They may take the form of panels, forums, or open class 
discussions. In any case it is essential that no restraint or compulsion 
be placed upon the students by the teacher, except to maintain order 
and prevent comments from offending or becoming too personal. 


Fig. 16. Model personal data blank 


PERSONAL DATA BLANK 
(Student Information Sheet) 


Grade Level 


Introduction: (An oral explanation given by teacher 
should include a resume of purposes 
and instructions in filling in the 
form) 


Purposes: 


A. To obtain background information on new stu- 
dents 

B. To bring up to date factual information on all 
students 

C. To secure some background information essen- 
tial to the school counseling service 

D. To obtain background information that will aid 
teachers in understanding and solving stu- 
dent's problems 


Section I 


A. Personal data 
B. Family data 
C. Home data 

D. Health 


Section II 


E. Future plans 
F. Present preferences and interests 


STUDENT INFORMATION SHEET 


INSTRUCTIONS. 


Listed below are questions to be answered and 
blanks to be filled in. After completing one 
part move on to the next. If you come to some- 
thing that you would rather not answer go on to 
the next part. There is no need to hurry as this 
is not a test. 


A. PERSONAL DATA 
Name 


(first) (middle) (last) 
Date of Birth__ Age 
Name of Parents (or guardians) 


Address Tel. No.


B. FAMILY DATA


1. Is your father living?

2. Is your mother living?
   Do they live together?

3. Do you have a stepfather?
   Do you have a stepmother?

4. Place a circle around the last grade completed
   by your:

   father 1 2 3 4 5 6 7 8 9 10 11 12 College
   mother 1 2 3 4 5 6 7 8 9 10 11 12 College

5. Do you live with both parents?
   If not, whom do you live with?

6. What is your father's work?
   Does your mother work at a job outside the
   home?

7. How many older brothers do you have?
How many younger brothers? 

How many older sisters? 

How many younger sisters? 


C. HOME DATA 


About how far do you live from school? 


How do you get to and from school? (Check one) 
Walk ___ School Bus ___ Bicycle ___ Car ___


If you live on a farm how many acres does it 
have? 


Is your home (or farm) rented or owned?
How many people live in your home?
How many rooms are there in your home? 


Is there a language other than English spoken 
in your home? If so, what is the lan- 


guage? 
List the magazines there are in your home. 


About how many books are there in your home? 
Do you have regular home duties? 
If so, what are they? 


Do you ever earn money outside your home? 
If so, how? 
What do you like best about your home? 


(Use the back of this page if necessary) 


Fig. 16. Model personal data blank (cont.) 


Student Information Sheet


C. HOME DATA (cont.) 


12. In what ways do you wish that your home were 
different? 


D. HEALTH DATA 


What serious illnesses have you had? (Do not list
measles, whooping cough, mumps, chicken pox,
tonsillitis)


End of Section I 


Fig. 16. Model personal data blank (cont.) 
Student Information Sheet 


Name. 


Section II 


E. FUTURE PLANS 
1. How far would you like to go in school? 


2. What occupations are you most interested in? 


First choice 
Second choice 


F. PRESENT PREFERENCES AND INTERESTS 
1. What are your favorite subjects in school? 


2. Which subjects do you dislike most? 


3. What kind of reading do you enjoy most? (Check two)
comics__ science__
magazines__ newspaper__
fiction__ autobiography__


4. What kind of movies do you like best? 


5. What are your two (2) favorite TV programs? 


6. Do you play a musical instrument? 


If so, what? 
If not, would you like to play one? 


7. Do you have any hobbies? 
If so, what are they? 




Fig. 16. Model personal data blank (cont.) 


Student Information Sheet 


F. PRESENT PREFERENCES AND INTERESTS (cont.)


8. What outdoor sports or games do you enjoy most?


9. What do you like most to do in your spare time?


10. Of what clubs, organizations, or groups are
you a member?


11. If you had three wishes come true, what three
would you want more than any others?




Topics for discussion should, of course, be of interest and concern 
to the entire group and should grow or develop naturally out of the
experiences and activities of the group members. Some typical dis- 
cussion questions might include: 


Is it a good idea to “go steady” in high school? 

Is vandalism a sign of the times? 

Should the voting age be lowered to eighteen? 

How can reckless driving be curbed? 

Are secret societies desirable in high school? 

Is it possible to be popular without belonging to a gang? 
Are the morals of young people declining? 


Discussions of questions such as these enable the teacher to ap- 
praise his students’ attitudes and feelings about themselves, their 
peer relationships, and the society in which they live. 


PROJECTIVE TECHNIQUES 


Among the more difficult devices for the teacher to use in student 
appraisal is the so-called “projective technique,” defined as “a 
method of studying the personality by confronting the subject with 
a situation to which he will respond according to what that situation 
means to him, and how he feels when so responding.” 14 

Bernard says “the term projective techniques refers to a means
whereby a more or less neutral situation is given meaning by the
individual responding to it.” 15

Included among the projective techniques are the Rorschach Test, 
the Thematic Apperception Test, the Picture-Story Test, puppetry, 
drawing and painting (crayon, pencil, paint), sculpturing (clay), and 
word association tests. All these methods are basically the same, i.e., 
“the subject gives meaning to the situation and thus reveals what he 
himself is.” 16

The two most widely known projective techniques are the Rorschach
Test and the Thematic Apperception Test (familiarly known
as the TAT). The Rorschach consists of ten cards upon which are 


14 Lawrence K. Frank, Projective Methods (Springfield, Illinois: Charles C.
Thomas, 1948), p. 46.

15 Harold W. Bernard, Mental Hygiene for Classroom Teachers (New York:
McGraw-Hill Book Co., 1952), p. 303.

16 Ibid.




inkblots, some black and white, others in various colors. They 
are shown to the subject one at a time and he is asked to tell what 
the blot looks like to him. His response may be “A giant beetle,” 
“A design for a summer dress,” or “Two pixies playing patty-cake.” 
Since the blots are meaningless in themselves, the subject’s responses 
are indicative of his inner self and not an attempt to give the “right” 
answer. The responses are analyzed on the basis of several factors— 
the speed with which the subject responds, whether he reacts pri- 
marily to the black or the white fields, his reaction to color, evidence 
of action in the design, and so on. Although this test was designed to 
be administered individually, attempts are being made to administer 
it to groups by projecting the blots on a screen and providing ready- 
made responses from which the subject elects the one which most 
nearly fits his own interpretation of the blot. 

The Thematic Apperception Test consists of a series of thirty pic- 
tures and one blank card. Certain cards are designated for boys 
(under fourteen years of age), for men, for women, and for girls 
(under fourteen years of age). The subject is shown the various cards 
designated for his age and sex one at a time and is asked to tell a 
story about each in his own words. He is urged to tell a complete 
story, i.e., what happened before, what is happening, and what is 
likely to happen. In doing this it is assumed that the subject will 
project himself into the scene in some way, usually by identifying 
with one of the characters. It is for this reason that a distinction is
made between the pictures shown to persons of different age and sex. 
It is easier for a boy, for example, to project himself into a scene 
showing a little boy sitting on the doorstep of a log cabin than into 
a picture showing a little girl climbing a winding flight of stairs. 

The examinees’ stories about the pictures are recorded in some way 
(tape recorder, stenographic record, or notes made by the examiner). 
They are then analyzed and interpreted in terms of evidences of 
wishes, conflicts, fears, desires, or fantasies. The TAT is also being
experimented with in an attempt to adapt it to group use by projecting
the pictures on a screen and providing objective responses in
multiple-response form. 

It must be emphasized that, although projective techniques are extremely
valuable in the appraisal of personality and behavior, the


difficulty of interpretation makes them impractical as evaluative 
tools for the teacher. This is true particularly with the Rorschach
and the TAT. A teacher may obtain valuable leads or clues as to a
student's problems or conflicts by an informal, casual appraisal of
his art work, drawings, clay sculpture and/or other work products, 
however, and it is recommended that every teacher be alert to the 
evaluative potentialities of these media and activities. 


CHAPTER 
12 


The Case Study and the Case Conference 


THE CASE STUDY 


THE CASE-STUDY TECHNIQUE, although relatively new in the field of 
education, is taking on ever-increasing importance both as a method 
of evaluating students and in more adequately providing for their 
needs. “As long,” says Traxler, “as teachers were interested mainly 
in teaching subject matter to groups of pupils, they had no real need 
for case studies. However, the recent tendency to take account of in- 
dividual differences and the emphasis on mental hygiene and guid- 
ance have brought into sharp focus the need for understanding each 
pupil. Consequently, an increasing number of schools are turning to 
the case-study method. . . .”1 

It is clear from Traxler's statement that the major purpose of the 
case study is to contribute to a better understanding of each student. 
To this end, two principal types of case-study methodology can be 
identified: (1) the informal, continuous, informative procedure, and 
(2) the formal, diagnostic, problem-oriented type. 

The case-study method has been criticized because it takes too 
much of the teacher's time, it requires a deep knowledge and understanding
of human development and behavior, and it requires skills
in interviewing, testing, and test interpretation, as well as in other 
technical fields, which the classroom teacher usually does not possess. 
These criticisms have been leveled almost wholly at the formal case 

1 “Case Study Procedures in Guidance,” Educational Records Supplementary Bul- 
letin B (New York: Educational Records Bureau, 1946), p. 2. 



study. Such criticisms are not defensible when aimed at the informal 
procedure, which is a continuous and integral part of the teaching 
process. In this latter sense the case study is a continuing process to 
obtain information concerning the development of all pupils in the 
school. Each teacher in the course of observing and testing his stu- 
dents, interviewing and/or conversing with them, reading their auto- 
biographies, examining their work products, their personal data 
sheets, and their cumulative records, as well as in numerous other 
ways, acquires many important facts about them. All that remains in 
order to convert this process into a case study is to interpret the
accumulated facts in terms of the student’s over-all development (or
some specific aspect of his development); come to a conclusion
concerning the best or most effective learning experience(s) for the
student; provide the experiences; and evaluate the results of the
experiences through a repetition of the information-collecting procedures.
It seems reasonable to assume that a professionally-minded teacher
makes such a procedure a phase of his daily teaching procedure. 

Thus, each student becomes the subject of the teacher's individual 
concern and the focus of his attention. This attitude of continuous 
attention to each student is highly important—far more essential to 
the welfare of the student and the success of the teacher than the 
more formal, step by step processes of the structured case study, ex- 
cept for the difficult problem cases occasionally encountered. 

It is important to note in summary that the informal, continuous 
case study— 


1. Is a phase of good teaching and a skill of the professional teacher.

2. Is a phase of the regular teaching process, not an appendage to it.

3. Is necessary for adequate evaluation of the “whole child.”

4. Can involve or utilize all the usual techniques for appraisal of
human growth and development.

5. Is a significant aid in the prevention of serious problems and
maladjustments, inasmuch as incipient difficulties are “nipped in the
bud.”

6. Provides a practical basis for the individualizing of instruction—
methods, techniques, materials, and experiences.

7. Makes the teacher ever more conscious of students as unique
human beings with their own individual peculiarities, problems,
hopes, and frustrations.


8. Enhances the professional growth of the teacher by contributing
to the development of a more scientific attitude and approach to
the teaching process.

9. Is designed primarily to prevent serious problems rather than to
“cure” them.


The formal case study 


Serious problems will occasionally occur even with the best of edu- 
cational programs. It is for the investigation of these more complex 
and deep-seated maladjustments that the formal case study is a valu- 
able tool. Although the formal case study includes the same steps, 
in general, as the informal, continuous study described in the pre- 
ceding paragraphs, the formal case study has certain important 
characteristics which differentiate it from the more informal study. 


1. Ordinarily only one student, or a very few, can be studied at any
one time by any given teacher.

2. A rather well-structured form or pattern (although the form need
not be identical for every case) is usually followed.

3. All (or at least most) of the information or data assembled
relative to the case, as well as the interpretation of the data, the
diagnosis, the remediation recommended or suggested, and the
follow-up are accurately recorded in written form.

4. Much more data of a special or technical nature are usually
obtained than is true with the informal study.

5. The services of specialists (doctors, nurses, psychiatrists, child
guidance clinics, social workers, and the like) are frequently
employed.

6. Effort is most often directed at the diagnosis and solution of a
rather severe problem.

7. Very complete and comprehensive cumulative record data are a
must if teachers are to use the method at all.


The formal case study begins when a problem is detected, usually 
through observation of the student, and corroborated by a careful 
perusal of cumulative records. Once a problem is recognized the case
study proceeds to include these principal parts:


1. Collection of data about the student

2. Organization of these data into meaningful, practical, usable
categories

3. Interpretation of the data


4. Analysis of the problems, needs, and plans of the student in the 
light of the data 

5. Tentative diagnosis of the basic problem 

6. Discussion and agreement concerning the appropriate therapy or 
treatment

7. Implementing of the agreed-upon remedial procedure(s) 

8. Follow-up (evaluation) of the results of the therapy 


Although, typically, these are the major parts of a case study, it is 
not intended that they be regarded as a time-sequence procedure. 
There is bound to be considerable overlapping and running together
of the various parts as the study proceeds. This is to be expected and 
should be anticipated by the teacher about to begin a case study. 

Step one. Inasmuch as it is necessary, in conducting a case study, 
to obtain the facts about the student, the cumulative record should 
be consulted as the initial phase of the investigation. Most of the im- 
portant data relating to the case will already have been assembled 
and organized there. The question of which data are important and 
should, therefore, be recorded as part of the case study is dependent 
to some extent upon the nature of the case and the purpose of the 
study. However, in any case, even with a very limited and specific 
difficulty, the problem can best be diagnosed and remediation sug- 
gested when the problem is interpreted against the background of the 
“whole student.” Therefore, all the available data included in the 
cumulative record should be carefully examined, since every item of 
information may be important. It is assumed that a complete record 
will include the student’s personal health and social history, test 
scores (principally achievement, intelligence, and aptitude), academic 
achievement, interests, and personality information. If any of these 
data are not included, those that seem pertinent or significant should 
be obtained as part of the first, or data gathering, phase of the study. 

Step two. Since it is somewhat unlikely that very recent informa- 
tion about any student will be contained in his cumulative record 
(most facts are recorded or entered at specified times or intervals 
during the year), it is usually necessary to interview all persons who 
have had recent contact with the student, including classroom 
teachers, class counselors, homeroom teacher, gym teacher or coach, 
school nurse, and librarian, as well as the parents, clergyman, and 
even other students. Written records should be made of each inter- 




view, or each interested person should be asked to write a brief state- 
ment concerning the student in answer to the case worker's specific 
questions regarding him. 

Step three. Information is obtained from the student himself 
through interviewing him, administering additional tests to him, par- 
ticularly diagnostic tests, personality or adjustment inventories, atti- 
tude scales, and questionnaires. In many instances, the results of the 
administration of such instruments will provide convenient leads or 
cues for the interview which should follow. 

Step four. The information is assembled in a written form con- 
venient for reviewing, interpreting, and analyzing the data. Thus far 
the case study is not significantly different from a case history. The 
unique aspects of the case study appear in steps five, six, and seven 
which follow: 

Step five. The assembled information is studied intensively, ana- 
lyzed, and interpreted in the light of the original problem or difficulty 
with a view to arriving at a tentative conclusion or diagnosis con- 
cerning the cause or causes of the problem. Having arrived at such a 
diagnosis, the conclusions are written up and become a part of the 
case study record. 

Step six. Having arrived at a diagnosis of the problem, it next
becomes necessary to formulate a plan of action designed to alleviate
or “cure” the difficulty or solve the problem in whole or in part, and
to record, in writing, a plan of treatment or therapy. It may be
assumed that, once a plan of treatment or therapy has been decided upon,
action will be taken to put the plan into effect at the appropriate time.

Step seven. As the treatment proceeds, the results should be evalu- 
ated in terms of the progress made by the student in overcoming his 
difficulty or solving his problem. This phase of the case study may 
be continued in some cases as long as the case worker has any con- 
tact with the student in a way that would lend itself to appraisal of 
the student, or at least until such time as it appears that the treat- 
ment is leading to definite success or failure. 

It is not uncommon for a case study to be undertaken with a view
to solving what appears to be a very obvious problem, only to discover
during the course of the study that the “obvious problem” is only a
symptom and that the real problem is not so evident or obvious. It is
important, therefore, that the teacher conducting the case study 




clearly differentiate between the overt or surface manifestations of 
difficulty and the basic problem (which is usually not discovered 
until all the data are recorded, analyzed, and interpreted). In order 
to be perfectly clear, the initial statement of the problem should 
actually be termed "reason for study" rather than "the problem." 

One further word of caution may prove helpful especially to the 
teacher inexperienced in case-study work. It is extremely necessary 
to approach every case with an open mind, with no preconceived con- 
clusions, and with an attitude of readiness to switch courses if it 
appears that the original line of investigation is achieving no results. 
In other words, the case study should not become a technique for 
the purpose of corroborating the teacher's personal, unvalidated
judgment concerning the causes of the student’s problem. As is true with
every instrument or tool, regardless of its expressed purpose or func- 
tion, the case study is no better than the person using it. 

There is no single form or outline which will fit every teacher’s 
needs when it comes to assembling and organizing the data for a 
case study. Similarly, there is no magic formula which tells one what 
specifications or areas of information are to be included in the study. 
In so far as the outline to be followed in making the study is con- 
cerned, any pattern which is meaningful to the teacher and which 
arranges data in a convenient form is satisfactory. It is usually
convenient, however, in those instances where a fairly complete cumulative
record is kept for every student, to have the outline of the case
study agree with the cumulative record. This facilitates transcribing 
data from the cumulative record to the written case-study report. As 
far as the items of information to be gathered and assembled for the 
study are concerned, the nature of the case, the teacher’s time and 
skill, and the facilities available for obtaining data will determine 
what and how much information will be collected. In spite of the 
fact that many authors and authorities have presented outlines to be 
followed in the conduct of a case study with little agreement as to 
specific items of information to be included, there is rather general
agreement concerning certain main elements. The authors present an 
adaptation of Rivlin’s outline of the case-study method, not as a 
composite but rather as an illustration of an outline which embodies 
the essential elements of a complete case study and a guide to the 
teacher to help him give direction to his efforts. 




I. History and Descriptive Information 


A. Identifying data (the student): Name, address, school, grade,
sex, age (birth date), teacher (homeroom teacher), nationality
(race), color, religion, significant and objective comments
describing physical appearance of the student, condition of clothing,
obvious physical or mental limitations, mannerisms.

B. Reason for the study (the complaint): Specific incident(s),
setting and probable causes, plus name(s) of person(s) making
complaint.

C. Personality Traits 
General emotional tone; for example, cheerful, moody, etc. 
Attitude toward his family (father, mother, siblings, others) 
Attitude toward his school (teachers, administrators, others) 
Attitude toward his friends 
Attitude toward himself, his abilities, and problems 
Play life 
Hobbies 
Educational and vocational ambitions 
Marked likes and dislikes 
Unusual fears 
Results of special tests (projective, etc.) 

Any special personal problem? 

D. Educational Status 
Age at entrance to first grade 
Present school achievements 
History of retardation or acceleration
Special deficiencies and proficiencies (results of diagnostic tests)
Past record in work and conduct 
Schools attended—type—location 

E. Results of Medical Examination 
Physical defects 
Efficiency of sensory organs (vision—hearing) 

General condition of health 

Nutritional status 

Comparison with normal height and weight (height-weight
ratio)

Muscular coordinations 

Reduced or exaggerated reflexes 

Twitchings, tics, tremors 

Peculiarities of gait or speech 

Previous health history 


F. Results of Mental Examination
Intelligence quotient (also subtest scores, if available)
Results of achievement tests
Special abilities
Special disabilities
Vocational aptitudes and interests
G. The Home Environment
The individuals living at home (number, age, relationship, sex)
Apparent economic level
Apparent social status
Parental methods of discipline
Parents’ emotional disposition
Attitude toward this child
Possibilities of securing the home’s cooperation
Unusual customs, traditions observed
Cultural resources (educational level of parents, etc.)
Relations within home (parent-child, child-child, etc.)
Record at other social service agencies
H. The Neighborhood Environment
Recreational facilities
Housing and living conditions
Desirability of his playmates
Any special obstacle to adjustment
I. Social Background and Activities (outside school and home)
Church affiliation and attendance (also Sunday school)
Boy Scout, Girl Scout, Hi-Y, 4-H, Future Farmers, Local Youth
groups
Summer camp attendance
Civic organizations (including municipal band, orchestra,
athletic teams, etc.)
Gang affiliations
Sexual irregularities
Court record


II. Summary of Case Data

A condensation of the sum total of all significant facts assembled
in light of the problem being investigated.


III. Diagnosis

A practical workable hypothesis or guess as to the cause or
causes or the explanation of the problem under consideration, based




upon all the evidence obtained and recorded. It is worth repeating 
that an original diagnosis is seldom final; new hypotheses may be 
formulated and new diagnoses made as new and additional evi- 
dence is obtained. 


IV. Treatment and Follow-up 

The actual treatment grows out of the diagnosis and may be con- 
sidered the culmination of the case study. If the recommended 
treatment or therapy proves effective, it verifies the soundness of 
the diagnosis; if it proves ineffective, it suggests that either the 
diagnosis was in error, the wrong treatment was used, or conditions 
surrounding the case (either the individual himself or his environ- 
ment) have changed. Under such circumstances the case may need 
to be restudied or a different form of treatment be tried. In any 
event, an effective follow-up or evaluation of the case from the 
standpoint of the success of the treatment in bringing about an 
improvement in the student’s adjustment is imperative. 2


Limitations of the case study method 


In previous sections of this chapter, the advantages and values of 
the case study as a technique of judging the growth and development 
of students have been discussed. It is equally important to the teacher 
to be aware of the limitations of this, as of every other method of 
studying students. Knowing the limitations of a method makes the 
teacher not only more careful in its application but also more con- 
servative in accepting the results. Erickson has enumerated seven 
limitations of the case method from the standpoint of the school 
counselor. However, since the same limitations apply when the 
teacher makes the study, they are presented here with the word 
teacher substituted for counselor: 


1. There may be a lack of accuracy in the data presented. 

2. The person using the data may play his “hunches” instead of
really interpreting the data. 

3. The teacher may not be competent enough to evaluate (appraise) 
the value of the data. 

4. The teacher may try to shortcut or abridge the process of getting 
the necessary information.


2 Harry N. Rivlin, Education for Adjustment: The Classroom Application of
Mental Hygiene (New York: D. Appleton-Century-Crofts, Inc., 1936), pp. 108-10. 




5. The teacher may lack an understanding of human beings and 
therefore be unable to use the data effectively. 

6. A great deal of time is required to collect and interpret case study 
materials. 

7. The teacher can nullify the entire process by jumping to conclu- 
sions, by acting in a way not warranted by the data, or by taking 
the responsibility for action away from the counselee (student). 3


A SAMPLE CASE STUDY 


The following case study is presented, not as an ideal to be copied, 
but rather as an illustration of a “first” study made by an experi- 
enced teacher. Although the case contains errors and shortcomings, 
it is the authors’ hope that it will encourage other teachers to attempt 
studies of their pupils. Notwithstanding the fact that it is not a pro- 
fessional job, the teacher who made the study gained a considerable 
amount of insight into and understanding of this student, with the 
result that she was able to assist Susan in making greater progress 
and in improving her over-all adjustment. 


The Case of Susan 


Problem 


The student was referred by her mother, who came to the teacher- 
counselor for suggestions as to how she could best help her daughter in 
school. She said that her daughter no longer cared to remain in school. 
She appeared reticent to discuss the reason why her daughter wished to 
drop out. When asked what the daughter planned to do if she left school, 
the mother answered, “Just stay home—the only thing she could do . . .” 
When further questioned, there was no answer. The teacher-counselor 
then asked if there were some things that the daughter liked about 
school, to which the mother replied, “She loves to play basketball, and
I'm sure that is what has kept her in school until now. Also, her desire
to become a nurse . . .”


Personal 


Susan, 15, was a ninth-grade high school freshman, 5 feet, 4 inches 
in height, and weighed 108 pounds. She was a small-boned, slender, at- 
tractive girl, and appeared to be very shy. Her school health record 


3 Clifford E. Erickson, A Practical Handbook for School Counselors (New York:
Ronald Press Co., 1949), p. 42. 




showed hearing and vision both to be normal. It also showed that she 
had had both measles and chicken pox before starting to school, and 
scarlet fever at the age of ten. Her complexion was clear.

Susan was the youngest of six children. Two older brothers dropped
out of school about the tenth grade, left home, and now have good jobs. 


[Profile chart. Columns: test name and form, norm group, percentile rank scale.]
MENTAL: Kuhlmann-Anderson (Grade 4); Otis (Grade 7 and H.S. Fr.)
ACHIEVEMENT: Iowa Test of Educational Development (Social Studies, Natural Science background, Correctness in Writing, Quantitative Thinking, Reading: Social Studies, Reading: Natural Science, Reading: Literature, General Vocabulary, Use of Source Material); Diagnostic Reading (Rate of Reading, Story Comprehension, Total Comprehension)
INTEREST: Kuder Preference Record (Outdoor, Mechanical, Computational, Scientific, Persuasive, Artistic, Literary, Musical, Social Service, Clerical)
APTITUDE: SRA Clerical Aptitudes, H.S. Fr. (Office Vocabulary, Office Arithmetic, Office Checking, Total Score)


Fig. 17. Profile chart of Susan's test record 


An older sister finished high school, worked in a private home, and then 
married a farmer. Two other sisters were at home, one working in a 
local restaurant and the other attending high school as a junior. 

Susan’s father was a day laborer and was reported to be a good 
worker. He sometimes worked on the railroad in the summer, but more
often at odd jobs in town and in the country. He usually received good 




wages with the exception of two months during the winter when work 
was scarce. During this time merchandise for the family was charged at 
the local store. The bill was frequently left unpaid until the amount had 
accumulated to the point of causing a mortgage to be put on the home. 
The home had been given to Susan’s mother by her father. The father 
had dropped school after the ninth grade. He had helped his father in 
his blacksmith shop and had worked on farms during the summers. On 
one of these farms, he had met Susan’s mother and they had married 
just before he left for service in World War I. 

The mother had finished high school and had then worked in a mer- 
cantile store until the death of her mother. She had then returned to the 
farm to keep house for her father. She had married against her father’s 
wishes although he had bought her a house. She had worked until just 
before her first son was born. 


Interview with Susan 


The establishment of rapport with Susan was at first difficult. Her 
timidity caused her to speak in a barely audible voice. She appeared re- 
luctant to talk about anything pertaining to home. At the mere mention 
of basketball, however, she began to talk more freely. When a referral 
was made to the state basketball tournaments, she burst into tears. 

She appeared well read and seemed to enjoy talking of books, maga- 
zine articles, and radio programs. She stated that she obtained books and 
magazines from the public library quite regularly. She also used the 
school library. She said she liked writing book reviews but didn’t like 
giving them orally. 

Susan stated that her school records were good. She was getting all 
A’s except in English in which she received B’s. She said her English 
teacher took off for her oral work. She indicated that her grade school 
marks had been A’s and B’s with the exception of C’s in art and music. 


Interviews with the family 

In the first interview at the home both parents were present. They 
were anxious for their daughter to continue in school. Both seemed 
evasive about her reason for desiring to quit. They admitted that she had 
no trouble in school with either her studies or her teachers and fellow 
classmates. The father was especially proud of his daughter’s scholastic 
and athletic records and stated that he had hopes for her success in both 
lines. However, he refused to be present at the next meeting to be 
arranged after the teacher-counselor had again talked with Susan. He 
told the teacher-counselor, “Her mother will tell you more when I’m not 
around. I hope you can help our daughter.” 




In later interviews with the mother, the counselor learned that at the
time of Susan’s birth, the father had been a periodic drinker, usually 
around paydays. He had often driven the mother out of the house, but 
had remained calm, peaceful, and repentant when sober. He exhibited a 
violent temper but was quick to get over his anger. The mother had wor- 
ried considerably about bills and had often been irritable and short- 
tempered with the older children. The father and mother had been forty- 
three and thirty-nine respectively when Susan was born. 

The three girls had worked and played together but had quarreled 
considerably when they were younger. The two older girls had thought 
Susan was babied after she had had scarlet fever. The mother again 
seemed hesitant to continue the discussion, but was encouraged by the 
counselor’s remark, “We both want to help Susan. We can do so only if
we work together from the information we have.” The mother then
revealed that Susan had suffered a bladder disorder and nervousness
following scarlet fever, and that she had had to leave school for a year. She
had, therefore, dropped behind her classmates. She was required to do
no work at home. The older girls had felt imposed upon and had in
turn taunted Susan when their mother was not around. The mother felt
this had served to increase Susan’s nervousness.

About that time the family moved to another town where the father 
found work in a foundry. Susan was sent to a parochial school, the only 
school in town, until the seventh grade, at which time the family re- 
turned to their original home. Susan then attended the public school. She 
immediately showed interest in sports, especially basketball. She was 
considered a good forward and, with the shortage of girls in the small 
high school, was allowed to play on the high school team while in the 
eighth grade. Because of the team’s good record, the girls were taken to 
the state tournament, but Susan could not go. Since she was ten, her 
bladder weakness had prevented her from staying away from home
overnight. One of the girls had suspected that to be the cause of Susan’s
inability to accompany the team on the trip and had made remarks 
about it to her. The mother also stated that her daughter had always 
been reluctant to bring her friends to the home because she feared her 
father might be drinking or in one of his moods. 

At that point, the counselor suggested that the mother take Susan to 
a doctor, which action she at first flatly refused. She seemed to feel that 
a bladder difficulty was a disgrace, and that, since they lived in a small 
town, everyone would know about it and drop her (the mother) from 
their social groups. She also informed the counselor that her father had 
used to whip her brother for a similar weakness. She thought her 
daughter would in time outgrow the weakness just as her brother
eventually had. With some persuasion, she promised to give the matter some
thought, however. 

About a year previous to the interviews, the mother had decided to 
leave her husband. His concern caused him to stop drinking and to take 
a steady job in the lumberyard. Within a short time, the mother had 
fallen heir to her share of her father’s small estate. With that income, 
she had paid their debts and put into the home some much needed im- 
provements. The father had then begun to take pride in installing addi- 
tional improvements. Since that time, he had been urging his wife to 
terminate her work in a store on Saturdays; at the time of the inter- 
views, she hadn’t done so. 

The older of Susan’s sisters came into the room during one of the 
interviews and made the following comment: “In the fourth grade, Susan 
was chosen for the lead in the Christmas cantata, which required wear- 
ing a long, white gown. She wouldn’t take the part because her gowns 
were blue checked, and she wouldn’t ask the folks for another because 
she knew they couldn’t afford it.” 


Interviews with teachers and neighbor 


A neighbor for whose children Susan was often a baby sitter reported 
her to be “so conscientious—I never worry when she is with the chil- 
dren. If we are late, she won’t stay all night, but always insists on going 
home.” 

Susan’s former third- and fourth-grade teacher recalled: “She was an 
avid, eager pupil. Work was easy for her. She exceeded all of her class 
in reading and dramatizing. She was a happy, smiling little girl, but I 
was surprised when she moved back from to see how changed 
she seemed—so sober, nervous, and even bashful.” 

The comments of Susan’s general science teacher, who was also her 
basketball coach, were: “She seems so changed in attitude from the 
science class to the gym floor. She is so responsive in a game, but in class 
she doesn’t recite well at all. She can write like a college professor. She 
seems to want to learn so badly that she occasionally forgets herself and 
talks out excitedly, particularly during laboratory experiments, but when
she realizes it, she blushes and won’t say another word. Her classmates 
seem proud of her good work and really admire her.” 


Later interviews with Susan 


An autobiography which Susan was asked to write during the second 
interview told little. It didn’t come up to her usual standard of writing. 
She gave the following as her philosophy: “I want to make as many
friends as possible, to help people, especially the sick ones, to be tolerant
toward the point of view of others and to get the best things out of life
by doing what is right.”

Following is a part of the second interview with Susan:


TEACHER-COUNSELOR: “You want to drop out of school. Don’t you
like it?”

Susan: “I love it.”

TEACHER-COUNSELOR: “But you want to stop?”

Susan: “I just can’t go on, knowing they are laughing at me.”

TEACHER-COUNSELOR: “Who is laughing at you?”

Susan: “All the kids.”

TEACHER-COUNSELOR: “Why are they laughing?” No answer.

TEACHER-COUNSELOR: “Sometimes things that trouble us are not
so real.”

Susan: “It’s real all right.” (Bitterly)

TEACHER-COUNSELOR: “You want to become a nurse.”

Susan: “More than anything in the world.”

TEACHER-COUNSELOR: “To be one you must be a high school
graduate.”

Susan: “I know.”


Later:


Susan: “I can’t have friends. I could never ask them to my house.”

TEACHER-COUNSELOR: “Why not?” No answer.

TEACHER-COUNSELOR: “If you can’t be a nurse, what would you
like to do?”

Susan: “Be a social worker, but I’m afraid I couldn’t meet people.”


In a discussion concerning sports, Susan indicated that she had been 
very fond of playing ball with her brothers. She stated they had often 
protected her from their father's ire when he was drinking. 

She reported that she had never had many new clothes, that she had
worn mostly those made over from the clothes of her sisters or others
that were given to her. She did not attend school functions because of
having no new clothes. She said she felt very sensitive about her
made-over ones. She stated that she used to tell lies to keep her classmates
from knowing that she had not received as many gifts as they had for
Christmas and for birthdays. She indicated that she didn’t tell them
anything any more because it didn’t seem to matter.


Interpretation of data 


Susan had, seemingly, changed from a normal nine-year-old in the 
fourth grade to a confused, nervous, shy, and almost completely
introverted girl by the ninth grade. Her serious illness followed by continued
bladder difficulty, her mother’s attitude toward this weakness, the re- 
sultant compulsion to drop behind her regular class, the experience of 
being the recipient of her sisters’ wrath when she wasn’t required to 
help with the housework, constant fear of bringing friends to her home 
because of her father’s drinking, fear of remarks from her class and 
teammates, and lack of a true confidant—all seemed to be factors con- 
tributing to Susan’s present condition. 


Recommendations 


It was felt that with the correction of Susan’s physical condition, 
along with a build-up of her general health, she might gradually over- 
come some of her shyness. It was further felt that she might then find 
friends who would absorb her into social groups.

It was recommended that since home worries had lessened and the 
family's financial status had improved, more interest be shown in Susan's 
clothes. It was suggested that she might be allowed to select some with 
the parents’ help and some on her own initiative. It was recommended
that she be encouraged to bring her friends to the home and that she 
accept invitations to their homes and to school functions. 

Her teachers were urged to encourage Susan to give oral talks in class, 
among small groups at first, and then to larger ones. Her coach and sci- 
ence teacher offered to enlist her assistance in explaining experiments to 
the class. 

It was further recommended, in view of the test data (see profile, 
Fig. 17) and since she had no desire for college training but wished to 
follow a nursing career, that her curriculum include biology, chemistry, 
and other scientific subjects which would give her background for that 
field. 

In view of Susan's mental ability, it was pointed out to her that not 
only would she be capable of achieving a college education from the 
standpoint of mental ability but that she might eventually wish to go to 
college. It was recommended that should such a desire develop she might 
secure a scholarship if her good grades continued. In an effort not to 
detract from her ambition to become a nurse without obtaining a degree, 
some stress was placed on the desirability of college training in her par- 
ticular case for two reasons: (1) her potential ability, and (2) the fact 
that she was also interested in becoming a social worker. 


Follow-up 


During the following December, the counselor saw Susan play in two 
basketball games. She played very well in both contests. After the
second game, Susan talked to the counselor a few minutes and said, “I’m
glad, you know. Mother says you think I can be helped, but she is doubt- 
ful. If I could be, I could have friends like other girls. She’s going to 
take me to during Christmas vacation to see one of the doctors 
you recommended. Thanks a lot. I feel different now that someone seems
interested. Dad is different, too, now that he doesn’t drink.” 

In February, both parents went to see the counselor. They had taken
their daughter to a doctor who reported there was a marked weakness
due, perhaps, to scarlet fever and that he was sure she could be helped.
He had performed a slight operation, prescribed a diet, and had given
Susan some medicine. Both parents expressed remorse over not having
taken care of the difficulty before. They reported that they had already
observed a physical improvement and they added that their daughter
seemed much happier. She hadn’t talked of quitting school since
Christmastime.

Susan’s high school “sister” had taken more of an interest in her and
urged her to attend some of the school parties. The parents had arranged
for some new clothes for her also.

Her science teacher reported considerable success in his
encouragement of her explanations of experiments to the class. He also felt she
was finding more friends among basketball team members.

At the basketball tournament, the counselor again talked a few
minutes with Susan. Little was said, but as she left to dress for the game,
she whispered, “Thanks. Thanks a lot!”


THE CASE CONFERENCE 


An important adjunct to the case study in appraising the development
of the student is the “case conference.” Warters says, “The case
study is probably the most useful method for revealing and evalu- 
ating long-term trends in a person's development, and the case con- 
ference is probably the best method for synthesizing or coordinating 
and interpreting data gathered from various sources.”4 The
conference is essentially a meeting at which a particular student is
discussed by those persons who know him or know something about him
for the purpose of arriving at conclusions or plans for improving his 
(the student’s) adjustment or aiding his progress or development. 

4 Jane Warters, Techniques of Counseling (New York: McGraw-Hill Book Co.,
1954), p. 284.

The conference usually opens with the chairman, or leader, stating
the general nature of the problem to be discussed and the purpose of
the conference. Following this he presents pertinent data from the
student’s cumulative record. In turn, each participant then contrib- 
utes information which he considers essential to arriving at a conclu- 
sion or plan of action. Either during or following the presentation of 
the data, the participants may ask each other questions in order to 
clarify certain points or bring out additional information. After all 
the facts are known (it must be stressed that gossip and hearsay have 
no place in a case conference), a diagnosis of the problem is arrived 
at and recommendations for the student’s treatment are agreed upon. 
Further, the persons who are to be responsible for carrying out the 
recommendations are either appointed or otherwise selected by the 
group. At the conclusion of the conference, the secretary prepares a 
summary of the proceedings, including data presented, the diagnosis, 
and the recommended therapy or ameliorative procedure. 

The conference technique has value not only to the student under 
consideration but also to the personnel—especially the teacher or 
teachers—involved in the conference. It has been the authors’ experi- 
ence that teachers, after a case conference, never fail to express a 
much deeper appreciation and understanding of the dynamic inter- 
relationships of home, school, neighborhood, and personal character- 
istics in the shaping of a student’s development and behavior. In 
addition, it makes the teacher much more objective in his appraisal 
of the student and less apt to judge him “dumb,” “lazy,” “a pest,” 
“a show-off,” on the basis of purely superficial observation of his 
surface behavior. 

In some instances a case conference may include as many as twenty 
or more persons, all of whom have significant contributions to make 
to a better understanding of the student. Such persons may include 
the school superintendent or principal, one teacher or several, a coun- 
selor, nurse, physician, psychologist, psychiatrist, social worker, 
speech correctionist, juvenile judge, reading specialist, clergyman,
parents, and even (though not in every case) the student. It should 
be obvious, of course, that not all of these people need be brought 
into every conference, and that many conferences may be carried on 
very effectively with only a small number of persons present. 

In order to make the case conference of maximum effectiveness as 
a technique of appraisal and judgment of growth and progress, cer- 
tain suggestions have been made by Fenton: 




1. The conference must be a co-operative effort. . . . Tolerance
toward opposing points of view is necessary. The ideal of service
to children should permeate the discussion. As far as possible
irrelevant matters should be omitted and the clash of individual
personalities and points of view avoided. . . . The conference group
exists primarily for the child’s welfare. If one member must coerce
or dominate the others to carry a point which he believes will help
a child, the technique has failed in its purpose.

2. Recommendations should be evolved at the conference.

3. Recommendations should be carried out promptly.

4. Sudden improvement should not be expected.

5. Improvement in the individual case must be viewed in perspective
. . . change in the child must be assayed . . . in terms of that
particular child. The significant criterion is the extent to which he
improves in his behavior in comparison with his earlier self rather
than how nearly “normal” he becomes in comparison with other
children.

6. Recommendations must not be dogmatic. . . . The
recommendations of the . . . conference must not be considered final and
rigid, but should instead have enough flexibility so that they may
be varied according to the needs of the child and the changing
circumstances of the case.

7. Parental co-operation must be obtained.

8. Publicity and gossip should be avoided. It is especially important
that no publicity whatsoever be given to the children studied. The
importance of a professional attitude toward the . . . conference
needs to be mentioned. Obviously, gossip about the cases must be
forbidden.5




The authors should like to add emphasis to the last suggestion. It
is extremely important that the case conference be regarded as a
strictly professional procedure and it should end when the
participants leave the conference room. The lunchroom, teachers’ lounge,
streets, or corridors are not proper places for continuing a discussion
of a student begun in the conference room.


5 Norman Fenton, Mental Hygiene in School Practice (Stanford, California:
Stanford University Press, 1949), pp. 80-86.




CHAPTER 
13 


Standardized Tests — 


Some General Considerations 


A POINT OF VIEW REGARDING TESTS 


TESTS ARE AN IMPORTANT INSTRUMENT and essential in helping the
teacher to understand the student better. They should not be
regarded as a mechanical tool, and they cannot make decisions for us;
but they do play an important part in making decisions about people,
and in detecting differences between individuals as a basis for
prediction and diagnosis.

In earlier discussions the authors have reviewed the application of 
many so-called informal, subjective, nonstructured techniques to the 
appraisal of student growth and development. There is no refuting 
the fact that such devices and methods are of inestimable value in an 
over-all evaluation of student progress, since there are no quanti- 
tative tools yet available to accurately measure progress toward some 
of the objectives of modern education. With the possible exception
of good teacher-made tests of the objective type, however, none of 
these techniques provide us with very precise, quantitative measures 
of the differences between individuals. There is danger of too much 
subjectivity entering into the guidance of students if such guidance 
is based entirely on observations, interviews, inventories, scales, and 
other devices of limited objectivity. It is wise—and necessary—to in- 
clude standardized test instruments in the evaluation program as a 
check on these other appraisals and techniques. Conversely, of 
course, the nonstructured techniques yield information which should 
be used to help us validate our test data. In short, the significance of
test scores is greatest when they are used in conjunction with a full
study of the student, by means of interviewing, case history, obser- 
vation, rating scales, and other methods. The full value of test results 
is not realized unless an adequate cumulative record is maintained 
for each student to provide the background material against which 
the test data can be interpreted. 

Tests must be regarded as but one means by which individual dif- 
ferences may be appraised. The extreme view held by some educators 
that tests supply all the information necessary for the guidance and 
instruction of students is just as unwise and illogical as the opposite 
belief that tests provide no useful data whatever. The truth of the 
matter, of course, rests somewhere between the extremes. Tests can 
be very useful if properly used and the results properly interpreted 
and applied. As with any tool, the effectiveness of a test depends 
upon the competency of the person who uses it. In the hands of a 
careless, unskilled, and untrained individual, any test, no matter 
how potentially good, is likely to provide results which are worse 
than useless; actually misleading. To say, therefore, that tests are
useless is just as foolish as to say that automobiles should be out- 
lawed because they occasionally injure people. It is, of course, the 
driver, not the car, which is at fault. The same reasoning applies to 
tests that cause errors to be made in prediction or diagnosis. The 
fault lies with the tester, not the test.


WHAT IS A STANDARDIZED TEST? 


The term standardized test has come to signify a measuring
instrument with six major characteristics:


1. The test is designed to measure important common outcomes of
representative courses of study, avoiding items which are likely to
be taught in relatively few schools.

2. Specific directions for administering the test have been worked
out and are stated in detail, usually providing even the exact words
to be used by the examiner and specifying exact time limits. By
adhering to the instructions, teachers in many schools can
administer the test in essentially the same way.

3. Specific directions are provided for scoring. Usually a scoring key
is supplied which reduces scoring to merely comparing the answers
with the key; little or nothing is left to the judgment of the scorer.
Sometimes carefully selected samples are provided with which the
child’s product is compared.

4. Norms are supplied to aid in interpreting the scores. Norms, based
on administration of the test to large numbers of children, provide
a basis for comparing a child’s score with representative scores for
different ages and grades; or with representative scores of children
of his own age or grade.

5. Information needed for judging the value of the test is provided.
Before the test becomes available for purchase, research is
conducted to produce satisfactory information about the test’s
reliability and validity.

6. A manual of directions is supplied which explains the purposes and
uses of the test, describes briefly how it was constructed, provides
specific directions for administering, scoring, and interpreting
results, contains tables of norms, and summarizes available research
data on the test.1
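
As a rough illustration of characteristics 3 and 4, the sketch below scores a set of answers against a key and then reads a percentile rank from a table of norm-group scores. The key, the answers, the norm-group scores, and the function names are all invented for illustration; they are not taken from any published test, and the percentile-rank convention used (counting half of the tied scores) is only one of several in use.

```python
from bisect import bisect_left, bisect_right


def score_with_key(answers, key):
    """Count the answers that match the scoring key exactly."""
    return sum(1 for given, correct in zip(answers, key) if given == correct)


def percentile_rank(score, norm_scores):
    """Percent of the norm group scoring below `score`, counting half of the ties."""
    ordered = sorted(norm_scores)
    below = bisect_left(ordered, score)
    ties = bisect_right(ordered, score) - below
    return 100.0 * (below + 0.5 * ties) / len(ordered)


# Hypothetical five-item test and a hypothetical norm group of ten students.
key = ["b", "d", "a", "c", "a"]
answers = ["b", "d", "c", "c", "a"]
norm_scores = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]

raw = score_with_key(answers, key)        # four answers match the key
rank = percentile_rank(raw, norm_scores)  # 70.0 for this invented norm group
print(raw, rank)
```

Because scoring is reduced to a comparison with the key, any two scorers obtain the same raw score; the norms then give that raw score a meaning relative to a reference group.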


TYPES OF STANDARDIZED TESTS 


Standardized tests are available to measure a vast number of 
human traits, characteristics, and abilities in the areas of intelligence, 
achievement, and personality. In fact, Hildreth lists a total of 5,294 
separate test titles in the 1945 supplement to the second edition 
(1939) of the bibliography of mental tests and rating scales.2

The “Classification of Tests” page from The Fourth Mental Meas- 
urements Yearbook gives further evidence of the wide variety of tests 
available.3

To organize or arrange this huge assortment of instruments into a 
relatively small number of classes is no small task. Such a classifica- 
tion might be made on the basis of several different criteria—form, 
purpose, content, and other characteristics. One classification might 
be in terms of whether the test is designed primarily for
administration to groups or to individuals.

1 Theodore L. Torgerson and Georgia S. Adams, Measurement and Evaluation for
the Elementary-School Teacher (New York: Dryden Press, 1954), pp. 36-37.

2 Gertrude H. Hildreth, A Bibliography of Mental Tests and Rating Scales,
Supplement to second edition (New York: Psychological Corporation, 1945).

3 The Fourth Mental Measurements Yearbook, ed. Oscar K. Buros (Highland
Park, New Jersey: Gryphon Press, 1953), p. vii.


CLASSIFICATION OF TESTS IN
The Fourth Mental Measurements Yearbook (1953)

1. Achievement Batteries
2. Character and Personality
   a. Nonprojective
   b. Projective
3. English
   a. Composition
   b. Literature
   c. Spelling
   d. Vocabulary
4. Fine Arts
   a. Art
   b. Music
5. Foreign Languages
   a. English
   b. French
8. Miscellaneous (cont.)
   h. Industrial Arts
   i. Philosophy
   j. Psychology
   k. Record and Report Form
   l. Religious Education
   m. Safety Education
   n. Testing Programs
9. Reading
   a. Miscellaneous
   b. Oral
   c. Readiness
   d. Special Fields
   e. Study Skills
10. Science
   a. Biology
   b. Chemistry
   c. General Science
   d. Geology
   e. Miscellaneous
   f. Physics
11. Sensory-Motor
   a. Hearing
   b. Motor
   c. Vision
12. Social Studies
   a. Economics
   b. Geography
   c. History
   d. Political Science
   e. Sociology


Because of the relative simplicity of the directions for administration
and recording or marking of answers, and the fact that close
observation and/or supervision of the individual testee is not considered
in the interpretation of the scores, almost any number of
students may be examined at one time by the group test. The actual 
number is limited, theoretically, only by the amount of space and the 
number of test copies available. Actually, however, there are certain 
factors which make it desirable to keep the number of students to be 
tested at any one time reasonably small. 

Although, theoretically, group tests may be administered to any 
number of students at one time, there are certain advantages to
restricting the group to a reasonable size:


1. It avoids the excessive noise and confusion usually present in very
large groups. 

2. It more nearly approximates an average size class, therefore, mak- 
ing students feel more at ease than in a very large assembly. 

3. It facilitates administration of the test by enabling the examiner 
to get the attention of each student more easily and be heard and 
understood more clearly. 

4. It makes it possible for the teacher (examiner) to observe students 
who are not working efficiently because of illness, failure to follow 
directions, and other such reasons. 

5. It makes it easier for the examiner to develop rapport with the 
group. 

6. It saves time in distribution of booklets, answer sheets, and other 
special equipment. 

7. It makes it possible for one teacher (examiner) to do the testing 
and proctoring, making sure that students start and stop on signal 
(in the case of timed tests this is very important). 


There is no reason, however, why a group test cannot be
administered to a single individual if such a procedure is necessary or
desirable.

The individual test is usually so constructed that administration 
to more than one person at a time is almost, if not entirely, impos- 
sible. Certain factors in the make-up and administration of the test 
make this true: 


1. A high degree of rapport must be built up with the examinee by 
the examiner to assure the maximum performance by the examinee. 

2. Directions to the examinee many times involve demonstration by 
the examiner of certain desired performance by the examinee. 




3. Careful oral questioning and/or observation of the examinee is 
often required during testing. 

4. The examiner rather than the examinee must record most of the 
answers or reactions of the examinee. 

5. Since a certain amount of urging or repeating of directions is per- 
mitted on some tests, the examiner must be free to talk to the 
examinee without disturbing others. 

6. Since some test items are not timed and others are subject to very 
rigid time limits, it would be very impractical to keep a group 
working at precisely the same rate. 

7. The cost of the test materials in many cases is very high (as much 
as twenty dollars or more for one set), making it impractical to 
have enough sets to allow many people to perform at one time. 


Examples of the individual test are the Stanford-Binet Intelligence
Scale and the Thematic Apperception Test. It is interesting to note 
that some tests, such as the Thematic Apperception Test, once used 
exclusively as individual tests, are now being converted to group use 
by altering the method of responding to the pictures. 

Another basis for classifying tests is whether the subject answers 
questions or solves problems by writing the answer or by drawing, 
or whether he is required to demonstrate his skill by manipulating 
objects or apparatus in some fashion. The former would be classified 
as paper-and-pencil tests, the latter as performance tests. Most group 
tests are of the paper-and-pencil type, while many individual tests 
are of the performance variety, at least in part. 

Tests may also be classified according to whether they are objec- 
tive or subjective, although strictly speaking this classification is not 
actually true, since no test is 100 per cent objective. The distinction 
really is one of degree of objectivity rather than an either-or dichot- 
omy. Objective-type tests or procedures are those in which the stu- 
dent’s responses are scored or summarized in such a way that all 
scorers would agree as to the score assigned or the analysis made of
the examinee. Multiple-choice tests, typing tests, and arithmetic com- 
putation tests illustrate the objective type test. The subjective test 
is one in which the personal judgment, opinion, or interpretation of 
the scorer affects, to a greater degree, the score assigned to a response 
of the examinee. Typical subjective procedures are the essay
examination, appraisal of a student’s performance in a piano recital, or
the estimate of his leadership as a member of a committee. 
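
The idea that objectivity is a matter of degree can be sketched by comparing the scores two scorers assign to the same papers. The example below is only an illustration under invented data: the score lists and the function name are hypothetical, and simple percent agreement is just one possible index of scorer agreement.

```python
def percent_agreement(scores_a, scores_b):
    """Percent of papers on which two scorers assigned exactly the same score."""
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("both scorers must rate the same non-empty set of papers")
    matches = sum(1 for a, b in zip(scores_a, scores_b) if a == b)
    return 100.0 * matches / len(scores_a)


# Hypothetical scores: two scorers using a key on a multiple-choice test,
# and the same two scorers marking four essays.
multiple_choice_a = [18, 22, 15, 20]
multiple_choice_b = [18, 22, 15, 20]
essay_a = [85, 70, 90, 60]
essay_b = [80, 70, 75, 65]

objective_agreement = percent_agreement(multiple_choice_a, multiple_choice_b)
subjective_agreement = percent_agreement(essay_a, essay_b)
print(objective_agreement, subjective_agreement)  # 100.0 25.0
```

A test would fall nearer the objective end of the scale as this figure approaches 100 per cent, and nearer the subjective end as the scorers' judgments diverge.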

Speed tests may also be contrasted with power tests to form an- 
other basis for classification. Speed tests may be of two kinds: 
(1) those in which the subject is given a fixed number of tasks to do 
or problems to solve and the time it takes him to complete them is 
measured; or (2) those in which he is allowed a fixed amount of time
and the number of tasks or problems he completes in that time is 
recorded. In such tests the level of difficulty of the tasks or problems 
is usually such that the subject certainly could complete them if 
given unlimited time. Speed is the decisive factor in such tests. In 
contrast, power tests present the examinee with a series of problems 
or situations of graded difficulty, beginning with those that are quite 
simple and progressing to a level where few if any subjects can suc- 
ceed in solving them. The examinee is merely asked to do as many 
problems as he can, with no limits set as to working time. Thus, the 
limiting factor in his score is the ability (power) of the subject, not 
his speed. 

Many tests incorporate the factors of speed and power in a single 
instrument. Any mixture of these factors is possible, depending upon 
the amount of time allowed and the difficulty of the items. To deter- 
mine if, or to what extent, speed is an important criterion of success 
in a test, the examiner must determine whether the subject completed 
all or nearly all the items he could have answered if he had been 
allowed unlimited time. 
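
One way to make this check concrete is to compare the score a student earns under the time limit with the score he earns on the same items without one. The sketch below is only an illustration: the function names, the sample scores, and the 0.90 cutoff standing in for "nearly all" are assumptions, not values given in the text.

```python
def speed_ratio(timed_correct: int, untimed_correct: int) -> float:
    """Share of the items a student can answer that he answered within the time limit."""
    if untimed_correct <= 0:
        raise ValueError("untimed_correct must be positive")
    return min(timed_correct, untimed_correct) / untimed_correct


def is_speeded(timed_correct: int, untimed_correct: int, cutoff: float = 0.90) -> bool:
    """True when the time limit kept the student well short of what he could do.

    The 0.90 cutoff is a hypothetical stand-in for "nearly all the items".
    """
    return speed_ratio(timed_correct, untimed_correct) < cutoff


# Hypothetical scores: (correct under the time limit, correct with unlimited time)
students = {"A": (38, 40), "B": (22, 40)}
for name, (timed, untimed) in students.items():
    print(name, round(speed_ratio(timed, untimed), 2), is_speeded(timed, untimed))
```

A ratio near 1.0 suggests the test worked as a power test for that student; a low ratio suggests that speed, not ability, limited his score.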

Probably the most useful and practical classification of tests is 
based upon the expressed purpose of the test. Such a classification 
would include five main types: 


1. Aptitude
   a. General (mental ability, scholastic aptitude)
   b. Special (mechanical aptitude, clerical aptitude, algebra aptitude, etc.)
   c. Readiness (reading, math)
2. Achievement in subject fields
   a. Survey
   b. Diagnostic
3. Interest
   a. General
   b. Educational
   c. Vocational
4. Personality or adjustment
5. Attitude (opinion)


Any classification of tests is purely arbitrary and merely made for
the sake of convenience, since no clear-cut lines of demarcation exist
between the various types. Basically all tests may be considered
measures of achievement. What the subject has acquired in the way
of knowledges, skills, abilities, interests, aptitudes, attitudes, or
opinions (his achievements) determines his score on any given test,
no matter what its stated purpose or title may be. Whether, for example,
a person scores high or low on a test of mechanical aptitude
depends upon the sum total of the knowledge and skill he has
achieved in the field of mechanics during his lifetime. Likewise, the
score a person makes on an attitude scale is determined by what he
has learned (again, achievement), positively or negatively, about a
variety of objects, persons, situations, organizations, groups, and
so on.


WHAT DO TESTS CONTRIBUTE TO THE EDUCATION 
OF YOUTH? 

Every phase or aspect of the program of the school, whether it be 
in terms of method, materials, equipment, plant facilities, adminis- 
trative procedures, or what not, has but one ultimate purpose—to 
facilitate the optimum growth and development of the student. Thus, 
“the proper function of a test in school is to improve the educational
program. It may do so by helping plan what learning experiences
a pupil needs, by indicating ways in which teaching can be
improved, or by building attitudes in pupils and teachers which will
promote better teaching.” 4


The most important function of any method or technique of evaluation
and appraisal, objective or subjective, is to provide an accurate,
reliable, and valid basis for the guidance of the student, not in
narrow vocational terms, but in the broad sense of assuring the
maximum utilization of his capacities, interests, and characteristics.


4 Lee J. Cronbach, Essentials of Psychological Testing (New York: Harper &
Brothers, 1949), p. 299.

Other uses of tests, although essentially secondary and contribu- 
tory to the guidance function, may be itemized as follows: 


1. To highlight the range of individual differences between students 
within a grade or class. In spite of the fact that a class may be 
“average” or above as a group, there is likely to be great vari- 
ability in the capacities, achievements, interests, or other char- 
acteristics of the individuals that comprise the group. These dif- 
ferences are clearly indicated by the range of scores made by the 
students in the tests. Thus, a graphic and convincing basis is pre- 
sented for the individualization of instruction. 

2. To provide a basis for dividing a class or grade into well-defined 
subgroups for corrective, remedial, or other special attention. 

3. To provide the teacher, early in the year, with an indication of 
the general level of proficiency of a class in a subject or course 
as a basis for initiating an effective program or method of in- 
struction. 

4. To analyze or diagnose the specific weaknesses and strengths of 
individual students, especially in the basic areas, reading, mathe- 
matics, and spelling. 

5. To compare accomplishment of individual students in various 
subjects in relation to their ability. 

6. To determine the amount of progress made by a class and/or in- 
dividual students during a given period, i.e., unit, semester, year, 
and so on. 

7. To compare the attainment or status of a given class or section 
with the status of other classes or sections in the same school or 
system or with other schools or systems on the basis of local or 
national norms. 

8. To check on the effectiveness of different methods of teaching, 
i.e., lecture, discussion, "activity," and the like. 

9. To provide data for effective counseling with parents, i.e., to 
bring about a better understanding of the strengths and limita- 
tions of the boy or girl, or justify a grade given in a subject. 

10. To obtain objective evidence concerning various curricular and/or 
instructional matters. This use of tests is referred to as “action 
research,” inasmuch as practical problems are investigated as a 
regular phase of the school’s administrative, supervisory, and
instructional activities and practices.

11. To provide objective data about students to report to colleges,
universities, and prospective employers.


12. To provide objective data with which to compare subjective appraisals
made of students by teachers, supervisors, and others.
This means that test results should be compared with teacher appraisals
in such areas as achievement, intelligence, aptitudes,
and/or personality factors. If wide discrepancies are noted, more
intensive study or analysis of the situation is indicated. Thus,
neither the subjective appraisal nor the objective test results
should be considered the infallible criterion, but each should be
a check on the other.

13. To provide evidence upon which to base the placement of a new
student about whom inadequate or incomplete data are available
concerning past educational experience.

14. To check on the validity or adequacy of a grading system in a
school.

15. To provide motivation for improving the achievement of students
in certain subject areas. It is a well-known fact that individuals
learn better when they are tested over the material learned. This
does not mean that tests should be used as whips or threats to
force students to study for tests alone. However, if results are
made known to students, their basic desire for success and ego-satisfaction
will serve to motivate them to put forth greater effort
to maintain or improve their status.

16. To help students gain a more realistic concept of themselves with
respect to abilities, interests, aptitudes, and achievement. It is a
fact, for example, that many students are helped to crystallize
their vocational interests as a result of the administration of an
interest inventory and the subsequent discussion of the results
with a teacher or teacher-counselor.

17. To give students an objective basis upon which to judge their
adequacy or aptitude for post-high school training in college,
university, technical, or trade school.

18. To give students a tangible, quantitative basis for gauging the
effectiveness of their work habits and methods of study.

19. To help the teachers determine the places where too little and/or
too much emphasis has been given in a subject.

20. To provide invaluable data for the study of the all-round development
of individual pupils through the use of the cumulative
record.


From the foregoing discussion it should be apparent that tests can 
serve a variety of purposes and provide a tremendous range and 
variety of information about students to assist the teacher in doing
a more effective job of instruction and guidance. “Objective tests
have occasionally been criticized on the ground that they cause a 
pupil to lose individuality and to become simply a point in a distri- 
bution. Nothing could be farther from the truth. When the results of 
a variety of tests over a period of years are brought together and re- 
corded on a well-organized cumulative record, each little point (each 
score or percentile, which in itself has almost no meaning) takes on 
meaning from its place in and its relationship to the total pattern. 
As test results are added to the record, literally hundreds of inter- 
relationships and combinations of the data in a single cumulative 
record become possible. As one studies these relationships, a living, 
growing individual emerges. Thus, test results, properly used, do not 
cause us to lose sight of individuals; rather, they help us to see 
these individuals more clearly, and as they really are." 5 


SUGGESTED TESTING PROGRAM 


The remainder of this chapter contains a suggested testing pro- 
gram for the public schools of Iowa as recommended by a committee 
of the Iowa State Department of Public Instruction. It can be used 
as a guide for planning a program of standardized testing. The plan 
illustrates the comprehensive nature of such a program. 


I. Some Features of a Balanced Testing Program 

The school should make every effort to employ at least one person 
who is trained in the use and interpretation of standardized tests. 

Before any test is given, the testing personnel should have estab- 
lished a definite reason for giving the test and the use that is to 
be made of the results. 

A practical functioning testing program will include group test bat- 
teries given at designated grades for each student when he is en- 
rolled in that grade. It will also include tests that can be given 
to individual students or small groups of students as need indi- 
cates. 

The areas of reading, arithmetic, study skills, academic aptitude, 
listening, interest, and personal adjustment need to be sampled 
at more than one time in each child’s school career. It is not nec- 
essary to sample each area more than once to get an indication of 
the quality of the growth being made by each individual. 

5 Arthur E. Traxler, et al., Introduction to Testing and the Use of Test Results in 
Public Schools (New York: Harper & Brothers, 1953), p. 95. 


The individual or small group part of the testing program should
have possibilities for the sampling of such areas as spatial relations,
mechanical comprehension, art judgment, color discrimination,
music aptitude, finger and arm dexterity, language skills,
and clerical skills.

The well-balanced testing program will contain other tests that are
comparable to the tests used in the regular batteries. In every
group of students who take a battery of tests at a given time,
some few students’ scores seem to be in error. It is necessary
to re-test these few students with a comparable test to evaluate
better the quality of individual growth being made.

Balance in a testing program requires that each student has an
opportunity to be sampled in the areas common to most students.
It also requires that each student has an opportunity to
be sampled in the areas unique to himself.


II. Some of the important purposes for giving group tests such as
achievement, interest, and personality are:

A. To see how the score of each individual compares with the
scores made by the group on the same test, at the same time,
under the same conditions.

B. To find out which questions each individual answers correctly
and which questions each individual answers incorrectly.

C. To see how the scores made by the small group compare with
those made by a cross-section group.

D. To measure individual growth. This requires the giving of comparable
tests at more than one time in each student’s school
career so that a cumulative growth picture is revealed.

E. To measure the effectiveness of various methods of teaching.

F. To determine whether the material being taught is either too
difficult or too easy for the specific group.

G. To help in curriculum planning by groups as well as by individuals.

H. To determine whether the objectives of the class are being
achieved.


III. Some important questions to consider when deciding about the testing
program are:

A. Does the test have comparable forms at more than one grade
level so that an individual growth picture may be indicated?

B. How many times should the same type of ability, achievement,
interest or adjustment be measured during the student’s school
career?

C. How much time can be justifiably taken from the class work for
testing purposes?

D. Is there time to use the results for all the tests selected?

E. Can the expense of purchasing, correcting, recording, and interpreting
the results be justified for each test selected?

F. Was each selected test standardized on the basis of using a large
cross-sectional group of boys and girls?

G. What portion of each test is valid for the levels at which it is
proposed to be used?

H. Is the reading difficulty of the test within the ability of the
pupils to be tested?

I. Is someone qualified to make proper interpretation of the results?


IV. Suggested Minimum Group Testing Program

A. Readiness—Last semester kindergarten or grade 1

B. Intelligence or academic aptitude—Grades 2-5-9-11 or 1-4-7-10

C. Achievement

1. Reading—Grades 2-4-6-8-11

2. Arithmetic—Grades 3-5-7

3. Spelling—Grades 4-6-8-10

4. Study Skills—Grades 4-7-10-12

5. All areas of The Iowa Tests of Educational Development at
least twice for each pupil.

D. Interest—Grades 8 and 11 if 8-4 plan of organization or 9 and
11 if 6-3-3 plan of organization



V. Suggested Expanded Testing Program—To be used with individuals,
small groups, or large groups as the local personnel feels best.

A. All that is suggested for the minimum program 

B. Achievement tests given each year 

C. Iowa Every-Pupil Test of Basic Skills, Grades 3-8, and Iowa 
Tests of Educational Development, Grades 9-12, each year 

D. Adjustment, mental health analysis, personality or problem 
check list—Probably best given to individuals as necessary un- 
less you wish to get a group picture for a specific purpose. 

E. Spatial relations—Grades 9 and 11 

F. Mechanical comprehension (boys)—Grades 9 and 11 

G. Art judgment for interested students—Any level above grade 4 

H. Color aptitude for interested students—Any grade level 


I. Finger and arm dexterity for interested students—Grades 9-10
or 11 (for students wishing to enroll in classes where good
dexterity is most essential)

J. Language skills—Grades 9-10 or 11

K. Clerical skills—Grades 9-10 or 11

L. Listening—Grades 9 and 11

M. Algebra aptitude—Grade 8

N. Geometry aptitude—Grade 9


VI. Suggested Possible Tests That Can Be Used *

A. Readiness 

1. Metropolitan Readiness Test—The Psychological Corpora- 
tion or The World Book Company or The Bureau of Edu- 
cational Research and Service (5), (7), (1) 

2. Harrison-Stroud Reading Readiness—The Bureau of Educational
Research and Service (1)

3. Lee-Clark Reading Readiness Test—California Test Bureau
(2)

B. Intelligence or academic aptitude with forms from kindergarten
through twelfth grade

1. California Short-Form Test of Mental Maturity—California 
Test Bureau (2) 

2. California Test of Mental Maturity—California Test Bu- 
reau (2) 

3. SRA Primary Mental Abilities—Science Research Associates 
(6) 

4. Henmon-Nelson Tests of Mental Ability—The Psychological 
Corporation or The Bureau of Educational Research and 
Service (5), (1) 

5. Kuhlmann-Anderson Intelligence Test—The Psychological
Corporation or The Bureau of Educational Research and 
Service (5), (1) 

6. Otis Quick-Scoring Mental Abilities Test—The Psychologi- 
cal Corporation or The Bureau of Educational Research and 
Service or The World Book Company (5), (1), (7) 

7. Kuhlmann-Finch Intelligence Test—The Bureau of Educa- 
tional Research and Service (1) 

C. Intelligence or academic aptitude with forms from grade 9
through 12

1. Thurstone Test of Mental Alertness—Science Research Associates
(6)

2. Terman Group Tests of Mental Ability—The World Book
Company (7)

3. Holzinger-Crowder Uni-Factor Tests—The World Book Company
(7)

4. American Council Psychological Examination, High School
Edition—Cooperative-Educational Testing Service (4)

5. Differential Aptitude Tests—The Psychological Corporation
(5)

* See pp. 276-77 for addresses of publishers as numbered.

D. Achievement

1. Reading

a. Durrell-Sullivan Reading Capacity and Achievement,
Grades 2-6—The Bureau of Educational Research and
Service (1)

b. Gates Basic Reading Test, Grades 3-8—The Bureau of
Educational Research and Service (1)

c. Iowa Silent Reading Test, High School and College—The
Bureau of Educational Research and Service (1)

d. Kelley-Greene Reading Test, High School and College—
The Bureau of Educational Research and Service (1)

e. Nelson-Denny Reading Test, Senior High and College—
The Bureau of Educational Research and Service (1)

f. Reading Comprehension Test C, Grades 7-12—Cooperative-Educational
Testing Service (4)

g. Detroit Reading Test, Grades 2-9—The World Book Company
(7)

h. Metropolitan Reading Test, Grades 3-9—The World Book
Company (7)

i. Stanford Reading Test, Grades 3-9—The World Book
Company (7)

j. California Reading Test, Grades 1-14—California Test
Bureau (2)

k. Triggs Diagnostic Reading Test, Survey Section, Grades
7-13—Committee on Diagnostic Reading Tests, Inc. (3)


E. Achievement battery (Reading-arithmetic-etc.)

1. California Achievement Test, Grades 1-14—California Test
Bureau (2)

2. Metropolitan Achievement Tests, Grades 1-9—The World
Book Company or The Bureau of Educational Research and
Service (7), (1)

3. Stanford Achievement Tests, Grades 1-9—The World Book
Company or The Bureau of Educational Research and Service
(7), (1)

4. Essential High School Test Battery, Grades 10-13—The
World Book Company (7)

5. Iowa Every-Pupil Tests of Basic Skills, Grades 3-9—The
Bureau of Educational Research and Service, Extension Division,
State University of Iowa (1)

6. Iowa Tests of Educational Development, Grades 9-12—The
Bureau of Educational Research and Service, Extension Division,
State University of Iowa (1)

7. SRA Achievement Series, Grades 2-9—Science Research Associates
(6)


F. Interest 


1. Kuder Preference Record, Form C (Vocational)—Science 
Research Associates (6) 

2. Strong Vocational Interest Blank—The Psychological Corporation
(5)

3. Occupational Interest Inventory—California Test Bureau (2)


G. Personal 


1. California Test of Personality—California Test Bureau (2)

2. Thurstone Temperament Schedule—Science Research Associates
(6)

3. Personal Audit—Science Research Associates (6) 

4. Mental Health Analysis—California Test Bureau (2) 

5. Kuder Preference Record, Personal—Science Research Asso- 
ciates (6) 


H. Special aptitude 


1. Differential Aptitude Tests—The Psychological Corporation
(5)

2. Flanagan Aptitude Classification Tests—Science Research
Associates (6)

3. Engineering and Physical Science Aptitude Test—The Psychological
Corporation (5)

I. Social

1. SRA Youth Inventory—Science Research Associates (6)

2. Mooney Problem Check List—The Psychological Corporation
(5)

3. SRA Junior Inventory—Science Research Associates (6)

Publishers from Whom Tests Are Available 


1. Bureau of Educational Research and Service,
Extension Division, State University of Iowa,
Iowa City, Iowa

2. California Test Bureau,
110 South Dickinson Street,
Madison 3, Wisconsin

or

5916 Hollywood Boulevard,
Los Angeles 28, California

3. Committee on Diagnostic Reading Tests, Inc.,
419 West 119th Street,
New York 27, New York

4. Cooperative Test Division,
Educational Testing Service,
Princeton, New Jersey

5. Psychological Corporation,
522 Fifth Avenue,
New York 36, New York

6. Science Research Associates,
57 West Grand Avenue,
Chicago 10, Illinois

7. World Book Company,
2126 Prairie Avenue,
Chicago 16, Illinois


CHAPTER 
14 


Standardized Tests — Application 


As a basis for adequate guidance and instruction, the teacher needs
information of two general types concerning each student: (1) his
present status—where he now stands with respect to his abilities,
interest, achievement, and personal and social adjustment; and (2) his
growth potential—how far and in what direction he can be expected
to go in terms of his capacities, limitations, and needs.


The first type of information includes facts “concerning the student's
level of general ability and the nature of any special abilities
he possesses. The knowledge and skills he has acquired through both
school and out-of-school experiences form a part of the ‘present
status’ picture. Also, much of what he is now is determined by his
relations to others, his initiative, his feelings of security, the degree
of self-confidence he displays, and other elements which enter into
personal and social adjustment. Information in all of these areas is
needed to determine just what the child brings to the learning situation
at the outset.” 1

The second type of information concerns rate, ceiling, and direction
of growth. Here facts are “needed regarding basic interests of
the individual, the kinds of goals he has set for himself, and the
appropriateness of these goals in terms of his general ability and
special aptitudes. Actually, prediction and judgment are involved in
answering this question. How far and in what directions can he (the
student) be expected to go in terms of his capacities, interests and
needs? It is essential, though, that this projection be made if the
teacher is to assist the individual in maximum fulfillment of his particular
capacities and the direction of energies toward attainable life
goals.” 2

1 Arthur E. Traxler, et al., Introduction to Testing and the Use of Test Results in
Public Schools (New York: Harper & Brothers, 1953), p. 5.

These various types of information may be arranged in five categories:


1. General (scholastic) aptitude or learning ability (the I.Q.)
2. Special aptitudes or abilities
3. Achievement in different fields of study
4. Educational and vocational interests
5. Personal and social adjustment


This arrangement is also useful for classifying the types of tests 
commonly used in most secondary school testing programs. 

Any classification of tests is arbitrary and purely for the sake of 
convenience, since no clear-cut lines of demarcation separate one 
type from the other. As a matter of fact, all tests are actually achieve- 
ment tests, since they measure the progress of the individual in some 
phase of his growth or development. 




MEASURING SCHOLASTIC APTITUDE 


“Intelligence testing" has been a common practice in most schools 
for many years. The results of such testing, expressed as I.Q.'s, were 
used to attempt to predict how well the student would succeed in 
school. Since it was obvious to any thoughtful person that no such 
instrument could measure intelligence per se, the more accurate title, 
“scholastic aptitude,” is now generally used to describe tests of this 
type. Such tests attempt to measure the student's capacity to succeed 
in the kinds of activities and experiences encountered in the school 
curriculum, both subject matter learning and adaptation. 

All so-called intelligence testing is based upon inference and assumption,
since capacity cannot be measured directly and the potentiality
for mental development is constantly in the process of being
realized. Therefore, we make the assumption that mental ability—
what the individual knows and can do on intellectual tasks—will
vary with capacity if opportunities for development and motivation
have been equal. Then we infer intelligence from mental ability. However,
we cannot measure ability directly either; what we actually
measure is performance. In effect, then, we measure performance,
from which we infer ability, from which we infer capacity.

2 Ibid., p. 6.

The basic assumption underlying aptitude testing, including both
scholastic and specific, is that two people, both experiencing the
same environmental background, will succeed on a given task in
direct proportion to their native capacity or endowment for that particular
type of activity. In effect, then, every aptitude test is a
measure of achievement, and we assume (or predict) that if Jim has
learned more from his past experience in particular areas than has
Mike from his, then Jim has more native capacity for learning in that
(those) area(s) and will continue to do better in the future. Two
conditions are of paramount importance in aptitude testing: (1) that
each student has equal opportunity and motivation, and (2) that
each will continue to achieve at the same rate following the testing
as before.

Although the classroom teacher is not called upon frequently to 
administer or score scholastic aptitude tests for his students, he 
should understand the basis for such measurement and be able to in- 
terpret the scores intelligently. Two scores are ordinarily derived 
from the results of such tests—the intelligence quotient, or I.Q., and 
the mental age, or M.A. Practically all the newer tests of this kind 
also enable the test user to derive subscores for each separate part of 
the test, such as a verbal M.A. and I.Q. and a nonverbal M.A. and
I.Q., in addition to providing him with a diagnostic profile based
upon the student's scores on the various factors measured by the test. 
A good example of this “new” type test is the California Test of 
Mental Maturity—Advanced Battery (Grades 9—Adult), Figure 18, 
the summary page of which is reproduced here to show not only the 
various intelligence factors measured by the test, but also the pupil 
profile which is derived from the results. 

[Fig. 18. Summary profile, the California Test of Mental Maturity]

Thus, the teacher is enabled to determine the student's specific
strengths and weaknesses, which is much more helpful in teaching
and guidance than to know only that Jean has an I.Q. of 116 or Sam
has an I.Q. of 97. To know that Jean's I.Q. of 116 represents high
verbal ability and “below average” mathematical ability and that
Sam's I.Q. of 97 indicates “average” ability in both areas enables
the teacher and the students to make wiser and more realistic plans
and decisions, not only about school but also about vocations and
other things.

In interpreting the results of the scholastic aptitude test, the 
teacher must have a genuine understanding of the terms M.A. and 
I.Q., both of which are derived from the older concept of intelligence
tests. If Frank has an M.A. of 13 derived from his test score, it simply 
means that his mental growth has progressed to the same point as 
that of the average child of thirteen years. Or, in other words, his 
performance on the particular test resulted in a score equal to the 
average score of all thirteen-year-olds tested. Now, if Frank is also 
thirteen years old chronologically, we can say that he has reached 
the same level of performance in thirteen years of growth as the 
average of all other thirteen-year-olds. In other words, his rate of 
mental development has been "average." On the other hand, if he 
were only eleven years old, we would say that he has reached this 
point faster than the average; and if he were fifteen years old we 
would say he had taken longer than the average to reach that level. 
The actual I.Q. in each case would be calculated by dividing the
M.A. by the C.A. (chronological age), both expressed in months, and
multiplying the quotient by 100 (to remove any decimals or fractions).
Thus, if Frank's C.A. were 13, his I.Q. would be
$\frac{156}{156} \times 100 = 100$. If his C.A. were 11, his I.Q. would be
$\frac{156}{132} \times 100 = 118$. With a C.A. of 15, his I.Q. would be
$\frac{156}{180} \times 100 = 87$. Since an I.Q. of 100 is
considered “average,” Frank's I.Q. in these three cases would be, respectively,
“average,” “above average,” “below average.” Actually,
however, an I.Q. is usually considered “average” if it is within a 
range of about 10 points above and below 100. 
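
The arithmetic for Frank's three cases can be checked with a short sketch. The function names are illustrative assumptions. The rounding to a whole number and the 10-point band follow the figures quoted above, although the text gives the band only as "about 10 points."

```python
def ratio_iq(mental_age_months: int, chronological_age_months: int) -> int:
    """I.Q. as described in the text: M.A. / C.A. (both in months) x 100, rounded."""
    if chronological_age_months <= 0:
        raise ValueError("chronological age must be positive")
    return round(100 * mental_age_months / chronological_age_months)


def describe(iq: int, band: int = 10) -> str:
    """Label an I.Q. relative to 100, treating 100 +/- band as 'average'."""
    if iq > 100 + band:
        return "above average"
    if iq < 100 - band:
        return "below average"
    return "average"


FRANK_MA_MONTHS = 13 * 12  # mental age of 13 years, expressed in months

for ca_years in (13, 11, 15):
    iq = ratio_iq(FRANK_MA_MONTHS, ca_years * 12)
    print(f"C.A. {ca_years}: I.Q. {iq} ({describe(iq)})")
```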

No matter what the I.Q. is numerically, it must always be consid- 
ered, not as a point, but as a range on a scale of values. No test now 
in existence is accurate enough to warrant any score's being consid- 
ered as anything more than an indication of the somewhat general 
area in which the true score lies. 

As we have seen, the I.Q. represents the rate of a person’s mental
development. It is, therefore, a mistake to interpret the I.Q. as a
measure of the ability of the person to perform certain prescribed
school tasks unless we also know that person’s M.A. or C.A. Two 
people may have identical I.Q.'s of 116, yet one might be three years 
old, the other thirteen. It would be ridiculous to think that the three- 
year-old child could successfully attack problems and situations of 
the type experienced by the thirteen-year-old. The M.A. is a much 
more useful score than the I.Q. for guidance of both teacher and 
student, since it represents an actual performance level rather than a 
rate of development. 
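
The point can be illustrated by turning the I.Q. formula around to recover the mental age. This is a minimal sketch: the function name is an assumption, and the two children are the three-year-old and thirteen-year-old of the example above.

```python
def mental_age_years(iq: float, chronological_age_years: float) -> float:
    """Rearranged ratio formula: M.A. = I.Q. x C.A. / 100."""
    return iq * chronological_age_years / 100


# Two children with the same I.Q. of 116 but very different ages.
for ca in (3, 13):
    ma = mental_age_years(116, ca)
    print(f"C.A. {ca} years, I.Q. 116 -> M.A. about {ma:.1f} years")
```

The identical I.Q. corresponds to a mental age of roughly three and a half years in one case and about fifteen years in the other, which is why the M.A. says more about present performance level.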

Another common mistake is to interpret similar I.Q.'s as indicating 
equal ability in all areas of scholastic endeavor, or to expect a stu- 
dent with an “average” I.Q. to do “average” work in all subject areas. 
In the first case, assume that Nancy and Jim both have I.Q.'s of 129. 
Without considering all the possible factors that might affect their 
scores, let us consider only the various factors included in the test 
itself. Jim may have high scores in the factors in which Nancy scores 
low, while Nancy may score equally high in the factors in which Jim 
scores low. Thus, their total scores could be identical, yet reflect ex- 
actly opposite abilities. The same reasoning will show that if Terry 
has an I.Q. of 125, the subscores on the various test factors may reflect
individual abilities all the way from below average to far above 
average. Thus, her nonverbal ability could be average or even below, 
while her verbal ability might be extremely high. 
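
A small sketch shows how identical totals can hide opposite profiles. The factor names and point values below are hypothetical, chosen only to illustrate the Nancy and Jim case; they are not taken from any actual test.

```python
def total(scores: dict) -> int:
    """Sum of the factor scores, standing in for the total test score."""
    return sum(scores.values())


def profile_differences(a: dict, b: dict) -> dict:
    """Factor-by-factor difference (a minus b) between two students."""
    return {factor: a[factor] - b[factor] for factor in a}


# Hypothetical factor scores for two students with the same total.
nancy = {"verbal": 62, "nonverbal": 38}
jim = {"verbal": 38, "nonverbal": 62}

print("Same total:", total(nancy) == total(jim))
print("Nancy minus Jim:", profile_differences(nancy, jim))
```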

One further caution should be mentioned. Even though the “new” 
scholastic aptitude tests provide the teacher with many more scores, 
I.Q.’s and M.A.’s, than the older tests, they also have their limitations.
Subtests are, of necessity, relatively short and, therefore, include only
a limited sampling of the specific factors included in each area.
Consequently, subtest scores possess limited validity and reliability
when compared with the total test scores. All subscores


should be considered only as indicators of the presence or absence of 
a particular ability, and conclusions based thereon should be used 
with caution. 
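
The effect of test length on reliability can be estimated with the Spearman-Brown formula, which is not mentioned in the text above and is offered here only as a supporting sketch. The total-test reliability of .90 and the assumption that a subtest is one fifth as long as the whole test are hypothetical.

```python
def spearman_brown(reliability, length_factor):
    """Estimated reliability when test length is multiplied by length_factor.

    Spearman-Brown formula: r_k = k * r / (1 + (k - 1) * r)
    """
    k = length_factor
    return k * reliability / (1 + (k - 1) * reliability)


# Hypothetical: total test reliability .90; subtest one fifth as long.
r_subtest = spearman_brown(0.90, 1 / 5)
print(f"Estimated subtest reliability: {r_subtest:.2f}")
```

Under these assumed figures the estimate for the subtest falls well below that of the total test, which is consistent with the caution urged above.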


GROUP VERSUS INDIVIDUAL TESTS 


Group measures of scholastic aptitude have certain advantages and 
certain limitations when compared with individual tests. When any 
test, no matter what the type, is administered to a group of students,
the purpose is to obtain information about the group, not about any
specific individual in the group. This is true, certainly, with scho- 
lastic aptitude tests. The trouble, if any, with group tests is not the 
test, but the use which is made of the results. When a teacher or 
guidance worker administers the Primary Mental Abilities Test 
to a class of forty ninth-grade students he should realize that 
the results will give him a fairly good measure of the general level 
of ability of the class as a whole, but he had better use caution in 
accepting any individual score as a completely accurate measure of 
the abilities of a particular student. This does not mean that the test 
is invalid, unreliable, or generally inadequate as a group measure. It 
does mean that there might be serious errors in the scores indicating 
the capacity of one Joe Smith, however, for a number of reasons. 
Chief among the reasons why group tests cannot be relied upon for 
an accurate individual appraisal is the fact that Joe Smith may— 


1. Not have understood the test directions. 

2. Not be able to read at the level at which the test is written. 

3. Not be motivated to do his best work—fails to see reasons for test- 
ing, has never seen or had previous test results explained, and 
so on. 

4. Be emotionally upset during the testing because of fear, nervous- 
ness, poor rapport with the test administrator, poor experience 
with tests in the past, and the like. 

5. Be below par physically due to a cold or other illness. 

6. Be disturbed by testing conditions: working in a large group, poor
light, inadequate ventilation, room too hot, room too cold, noise, 
interruptions, and so on. 


It is true that such conditions and factors may be true for every 
student in the group being tested, but perhaps not in the same degree 
due to the students' individual differences. It may also be true that
similar conditions might exist during individual testing, but there is 
one striking and important difference in the administration of indi- 
vidual and group tests. In the individual testing situation the test 
administrator can observe the student intensively, note the effects of 
such conditions on the student, and record his observations along
with the quantitative test score. Thus, anyone using the score has 
the information necessary for a more accurate interpretation of its 
meaning. In the group situation, the administrator is not able to
observe each testee as intensively, and must assume that the test is
evoking the best efforts of every student. Such, we know, is not true 
in every case, although it probably is safe to assume that it is more 
likely to be true than not true. 

The fact that group tests have limited value as measures of indi- 
vidual capacity does not mean that they have no value as tools in 
student evaluation. Their advantages may be stated to be— 


a. A rapid appraisal of large groups of students and a reasonably ac- 
curate measure of the level of capacity of a class, group, grade, or 
school. 

b. Relatively low cost per student. 

c. No need for highly or specially trained personnel to administer 
them (interpretation of results does require some special ability 
and understanding). 

d. Ease of scoring. 

e. Ready comparison possible with other groups through norms. 


The individual test, on the other hand, possesses the great advan- 
tage, previously mentioned, of providing a more valid and depend- 
able measure of the capacity of any one individual than is ordinarily 
achieved through group testing. It does have certain disadvantages 
however, because— 


a. Of its high cost per student.

b. It is time consuming. It may require an hour or more to test one
student.

c. Specially trained personnel must administer the test, and score and 
interpret results. No person should attempt to administer, score, 
and/or interpret a test such as the Stanford-Binet or the Wechsler- 
Bellevue unless he has had a college-level course in individual test- 
ing, including supervised experience in administration, scoring, and 
interpreting results. 

d. Norms are usually not provided for group comparisons. 


PERFORMANCE VERSUS VERBAL TESTS 


Performance tests are measures which require the testee to perform 
in some way, usually by drawing, tracing, arranging blocks in pre- 
scribed patterns, assembling jigsaw puzzles, or doing other similar 
tasks. There is little or no need for the testee to be able to read, un- 
derstand, or comprehend written or oral communication in such tests. 


Therefore, they are frequently used to measure the abilities of chil- 
dren and adults who have significant hearing loss (unable to hear 
oral instructions given by the examiner), language difficulty (do not 
understand the English language), or reading disabilities (unable to 
comprehend the written word). Under such circumstances the in- 
structions to the testee are given in pantomime by the examiner. 

Verbal tests, on the other hand, may be given either individually 
or in groups. Their chief distinguishing characteristic is the fact that 
the examinee, in order to obtain a score representative of his mental 
ability, must be able to read and understand the language of the test 
as well as comprehend the oral instructions given by the examiner. 

Thus, it is clear that the two types of instruments do not
measure the same mental abilities. An I.Q. obtained by a perform- 
ance test and another I.Q. obtained by a verbal test for the same stu- 
dent may differ considerably because of this fact. It is, therefore, 
essential that the I.Q. always be recorded with the name of the test 
by which it was obtained in order to make possible a valid interpre- 
tation. As a matter of good practice, of course, the name of the test 
should always be recorded with any test score, no matter what type 
test was used, since each test measures some particular combination 
of abilities or skills which may vary in larger or smaller degree from 
every other test. 

It would be unfair to leave this phase of the discussion of intelli- 
gence or scholastic aptitude testing without mentioning that, in 
essence, every test is a measure of performance in that the examinee 
is required to do something, either verbally or with his hands or 
body. Therefore, it is more correct to speak of verbal and nonverbal 
performance tests than verbal and performance tests. However, com- 
mon usage has made the latter classification acceptable. 

It is also true that so-called “verbal” tests frequently measure such 
nonverbal abilities as tracing a maze; drawing a picture of a person 
or house; identifying common characteristics in dissimilar objects,
and so on. Thus, there are verbal tests that possess some of the ele- 
ments of the performance test. Most of the popular scholastic apti- 
tude tests include a combination of verbal and nonverbal items, the 
nonverbal items frequently including such skills as basic mathe- 
matics and spatial perception. Tests of this type usually also provide 
subscores in the various areas so that the teacher may have a measure
of the student’s abilities in the verbal as well as the nonverbal
aspects of his general intellectual capacity. 


MISINTERPRETATIONS OF INTELLIGENCE TEST RESULTS 


As stated earlier, intelligence or scholastic aptitude tests are much 
maligned and are subjected to much abuse, primarily because the 
scores obtained by using such tests are often grossly misinterpreted. 
The misunderstandings below³ were listed by seventy-nine psychologists,
competent experts on mental measurement, as the most serious
disadvantages of giving test results to nonpsychologists. The total 
percentage of psychologists listing each factor is indicated on the 
right hand side of the page. Totals add to over 100 per cent because 
some psychologists listed more than one factor. 


Over-rating the test results; exaggerated belief in test validity,
  reliability, accuracy, constancy of the I.Q., etc.                  58%
Belief that a test measures all aspects of ability; neglect of
  separate abilities; use of I.Q. for purposes for which it is not
  intended.                                                           55%
Confusion in meaning of terms (I.Q., M.A., percentiles, intelligence,
  etc.); thinking of any test rating as an I.Q.; confusion of
  intelligence and information; wrong use of I.Q. applied to adults.  46%
Tendency to go to extremes in appraising tests; they are wonderful
  or worthless.                                                       30%
Assumption that tests measure innate ability; that they are
  independent of environment.                                         22%
Other misinterpretations of what the tests measure or of their
  limitations.                                                        19%
Failure to interpret scores in relation to norms or to think in
  comparative terms; misuse of norms.                                 16%
Under-rating the test results; exaggerated disbelief in test
  validity, reliability, etc.                                         12%
Failure to recognize that some tests are better than others
  (group vs. individual; limitations of verbal tests).                14%
Too much credence given a single measurement, regardless of
  how and where the test was administered.                            10%
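

The statement that the totals exceed 100 per cent can be checked directly. The sketch below sums the ten percentages listed above and, on the assumption (not stated in the source) that every percentage is based on all seventy-nine psychologists, estimates how many respondents listed each factor.

```python
# Percentages as listed above, in the order given.
percentages = [58, 55, 46, 30, 22, 19, 16, 12, 14, 10]
respondents = 79  # assumed base for every percentage

total_percent = sum(percentages)
# If each respondent could list several factors, the total divided by 100
# is the average number of factors listed per respondent.
mean_factors = total_percent / 100

# Approximate head counts; rounding makes these estimates only.
approx_counts = [round(p * respondents / 100) for p in percentages]

print("Total of percentages:", total_percent)
print("Average factors listed per psychologist:", mean_factors)
print("Approximate number listing each factor:", approx_counts)
```

On this reading each psychologist listed, on the average, between two and three factors.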


3 A. W. Kornhauser, “Replies of Psychologists to Several Questions on Practical 
Value of Intelligence Tests," Educational and Psychological Measurement, V (New 
York: Columbia University, Bureau of Applied Social Research, 1945), pp. 181-89. 


REPRESENTATIVE MENTAL ABILITY TESTS 


In this listing only representative tests applicable to the secondary 
school are included. Numbers following the tests refer to publishers 
listed at the end of this chapter. 


Group Tests 


 1. Otis Quick-Scoring Mental Ability Tests: New Edition (2)
 2. Otis Self-Administering Tests of Mental Ability (2)
 3. American Council on Education Psychological Examination for
    High School Students (8)
 4. New California Short Form Test of Mental Maturity (1)
 5. SRA Primary Mental Abilities Test (5)
 6. Kuhlmann-Anderson Intelligence Test (7)
 7. Terman-McNemar Test of Mental Ability (2)
 8. Ohio State University Psychological Test (5)
 9. Cooperative School and College Ability Tests (8)
10. Modified Alpha Examination Form 9 (3)
11. Lorge-Thorndike Intelligence Tests (Level 2) (6)


Individual Tests 


1. Stanford-Binet Intelligence Scale
2. Wechsler Adult Intelligence Scale
3. Arthur Point Scale of Performance
4. Goodenough Draw-A-Man Scale
5. Pintner-Paterson Scale of Performance Tests


ILLUSTRATIVE ITEMS FROM INTELLIGENCE TESTS 


The following examples illustrate the types of items used in a num- 
ber of the better-known intelligence tests. 


Verbal Type Test Items 


TEST 7* 


Directions: Mark as you are told the number of the word that means
            the same or about the same as the first word.


  H. blossom       1 tree        2 vine         3 flower        4 garden


* California Short Form Test of Mental Maturity—Advanced, S Form, California 
Test Bureau, 5916 Hollywood Blvd., Los Angeles 28, California, 1950. 


 96. inefficient   1 avoidable   2 able         3 incompetent   4 unruly
 97. confiscate    1 assert      2 seize        3 compile       4 comfort
120. erudite       1 crude       2 learned      3 rugged        4 polite
121. ameliorate    1 improve     2 harden       3 dilute        4 decorate
122. malapert      1 sick        2 lazy         3 slow          4 saucy
123. opulence      1 jewel       2 generosity   3 wealth        4 honor


Arithmetic Reasoning 
TEST C⁵


In Test C you are to get the answers to the examples as quickly as 
you can. Use the extra paper provided for any figuring you need to 
do. Work right down the page until time is called. Write every answer 
in the space provided on the answer sheet. 


1. How many are 20 hats and 9 hats? 
2. If you save $4 a month for 9 months, how much will you save? 


20. A manufacturer who had already supplied 1,897 dresses to a
    wholesaler delivered the remainder of his stock to 38 retailers. Of
    this remainder each retailer received 45 dresses. What was the
    total number of dresses supplied?
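

The reasoning these items call for can be set out as simple arithmetic. The sketch below works the three sample items shown; the variable names are chosen only for the illustration and are not part of the test.

```python
# Item 1: 20 hats and 9 hats.
item_1 = 20 + 9

# Item 2: $4 a month saved for 9 months.
item_2 = 4 * 9

# Item 20: 1,897 dresses to the wholesaler, plus 45 dresses to each
# of 38 retailers, gives the total number supplied.
to_wholesaler = 1897
to_retailers = 38 * 45
item_20 = to_wholesaler + to_retailers

print(item_1, item_2, item_20)
```

Item 20 differs from the first two in requiring two operations in the proper order, which is what distinguishes the later items from the earlier ones.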


Logical Reasoning 
TEST 4⁶


Directions: Read each group of statements below and the conclusions 
which follow. Then mark as you are told the number of 
each answer you have decided is correct. 

5 Modified Alpha Examination, Form 9, The Psychological Corporation, 522 Fifth
Ave., New York, N.Y.
6 California Short Form Test of Mental Maturity—Advanced, S Form, California
Test Bureau, 5916 Hollywood Blvd., Los Angeles 28, California, 1950.


E. All four-footed creatures are animals.
   All horses are four-footed.
   Therefore
   1 Creatures other than horses can walk.
   2 All horses can walk.
   3 All horses are animals.


54. If the wind changes it will either grow warmer or it will storm.
    The wind does not change.
    Therefore
    1 It will probably grow warmer.
    2 The conclusion is uncertain.
    3 It will not grow warmer nor will it storm.
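

Item 54 can be examined by listing every combination of circumstances that is consistent with the two premises. The sketch below does so under a strictly truth-functional reading of "if ... then," which is an assumption of this illustration; it does not attempt to represent the word "probably" in the first answer.

```python
from itertools import product


def premises_hold(changes, warmer, storm):
    """Premise 1: if the wind changes, it grows warmer or it storms.
    Premise 2: the wind does not change."""
    premise_1 = (not changes) or warmer or storm
    premise_2 = not changes
    return premise_1 and premise_2


# Every assignment of True/False to (changes, warmer, storm) that
# satisfies both premises.
cases = [c for c in product([False, True], repeat=3) if premises_hold(*c)]


def follows(conclusion):
    """A conclusion follows only if it is true in every remaining case."""
    return all(conclusion(*c) for c in cases)


def neither_warmer_nor_storm(changes, warmer, storm):
    return (not warmer) and (not storm)


def grows_warmer(changes, warmer, storm):
    return warmer


print("Cases consistent with the premises:", len(cases))
print("Neither warmer nor storm follows:", follows(neither_warmer_nor_storm))
print("Grows warmer follows:", follows(grows_warmer))
```

Since neither definite conclusion holds in every case the premises allow, the premises leave the weather undetermined.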


Nonverbal Test Items 


Put the right number under every drawing.⁷


[Illustration: sample drawings not reproduced]


7 Revised Beta Examination, The Psychological Corporation, 522 Fifth Ave., New 
York, N.Y. 


TEST 5⁸


In each picture draw what is left out. Work fast.


TEST 1⁹


Mark the shortest path from each arrow at the left to the opposite
arrow at the right, but do not cross any of the lines.


MEASURING SPECIAL APTITUDES 


The mental abilities or scholastic aptitude test measures several
aspects of general intellectual capacity and then combines the results
in the different areas into a single index of ability, the I.Q. This is
supposed to convey to the teacher an indication of the level of
intellectual “brightness” of the student and his potential for success
in his academic experiences. However, since the test measures only the
general factors associated with mental capacity (spatial relationships,
reasoning, memory, perception, word fluency, quantitative thinking, and
so on), it is not adequate as a measure of the potential of a student
in prescribed subject fields, such as algebra, mechanics, clerical
work, music, art, and the like. As a result, a large number of
so-called aptitude tests are now available to teachers in almost every
subject-matter area and in a great many occupational fields as well.


8 Ibid.  9 Ibid.

Although it is quite generally agreed that aptitude is a form of
specific intelligence in a given area, there is apparently no common
agreement as to a definition of the term. Warren's Dictionary of
Psychology defines aptitude as “a condition or set of characteristics
regarded as symptomatic of an individual's ability to acquire with
training some (usually specified) knowledge, skill or set of responses
such as the ability to speak a language, to produce music, etc.”¹⁰
Remmers and Gage define aptitudes as “present traits considered as
predictors of future achievement.”¹¹ Aptitude, according to
Wrightstone, Justman, and Robbins “may be defined as capacities and
abilities for a given line of endeavor, such as a particular art, school
subject, or vocation.”¹² Greene, Jorgensen, and Gerberich state
“aptitudes are those potentialities for success in an area of
performance that exist prior to direct acquaintance with that area.”¹³
Finally, Traxler says, “aptitude is a condition, a quality, or a set of
qualities in an individual which is indicative of the probable extent to
which he will be able to acquire under suitable training, some
knowledge, skill, or composite of knowledge and skill, such as ability
to contribute to art or music, mechanical ability, mathematical ability,
or ability to read and speak a foreign language. Aptitude is a present
condition which is indicative of an individual's potentialities for the
future.”¹⁴

The authors tend to regard an aptitude as merely a tendency for 
an individual to be proficient in some area of learning. Thus, the
question of whether the aptitude is more an innate than an acquired
quality or characteristic, or vice versa, is of no practical importance. 


10 Howard C. Warren, Dictionary of Psychology (Boston: Houghton Mifflin Co., 1934),
11 H. H. Remmers and N. L. Gage, Educational Measurement and Evaluation
(Rev. ed.; New York: Harper & Brothers, 1955), p. 218.
12 J. Wayne Wrightstone, Joseph Justman, and Irving Robbins, Evaluation in
Modern Education (New York: American Book Co., 1956), p. 334.
13 H. A. Greene, A. N. Jorgensen, and J. R. Gerberich, Measurement and Evaluation
in the Secondary School, 2nd ed. (New York: Longmans, Green & Co., 1953),
p. 31.
14 Arthur E. Traxler, Techniques of Guidance (New York: Harper & Brothers,
1945), p. 42.


The fact that a student is proficient is all that matters, although it
is well to keep in mind that whatever factors do contribute to the 
proficiency of the individual may be partly innate and partly ac- 
quired, such as physical and mental characteristics, motivational 
factors, and interests. 

Aptitudes can only be detected through achievement. In other 
words, when a student does unusually well on a test of achievement 
in arithmetic, it is possible to predict that he will be likely to con- 
tinue to do well in similar fields in the future. Thus, we have used 
a measure of past achievement to predict future achievement. What, 
then, is the difference between an aptitude test and an achievement 
test? Very little, indeed, except in terms of purpose or emphasis. In 
the achievement test the emphasis is on past success; in the aptitude 
test the emphasis is on future success. 

It is assumed in aptitude testing that the persons being tested have 
had sufficiently similar learning experiences (opportunities) so that 
any differences in scores are indicative of differences in aptitudes. It 
would be vastly unfair and illogical to give an aptitude test in al- 
gebra to a group of students who had had no mathematical training 
whatever, or an aptitude test in typing to a group who had never 
before seen a typewriter or didn’t even know the letters of the alpha- 
bet. Yet, the innate capacity of these people to learn might be very 
high. In cases like these the innate capacity must first have been in- 
fluenced by some training. In a music test, pitch discrimination is 
one factor necessary for a high score. This factor is little, if at all, 
affected by training. 

The crucial factor in most aptitude measurement is that for the
results to be of any real value, i.e., to be valid and reliable, the
examinee's background must be known. The teacher (or other
interpreter of the test score) must know, in general, how much
opportunity the student has had to learn the skills or acquire the
kinds of knowledge called for by the test. For example, Sam and Frank
both score at the 20th percentile (rather low) on a test of mechanical
aptitude calling for the identification of a wide variety of tools and
the recognition of parts of a number of mechanical contrivances. On
the surface it would appear that neither boy had any particular
aptitude for mechanics as measured by this test. A review of their
cumulative records reveals, however, that Sam has spent all of his
life living in an apartment in New York City, has had absolutely no
experience with tools of any kind, and has rarely even seen any of the
types of mechanical gadgets pictured in the test.

On the other hand, Frank grew up and lived his entire life in a 
small town. His father operated a garage and general repair shop 
next door to the family home. Frank spent a good deal of his time 
with his father in the shop. 

Do you think this test represented a fair measure of the mechani- 
cal aptitude possessed by Sam? Would you interpret the test results 
as indicating an equal capacity to acquire mechanical knowledge on 
the part of these boys? Both questions obviously must be answered 
in the negative. On the basis of Sam’s background we would be
forced to conclude that he had had no chance to acquire mechanical
knowledge. On the other hand, with his many opportunities to observe
and handle tools and mechanical equipment, we would say that
Frank should have acquired much more mechanical knowledge than 
he demonstrated on the test. Sam may possess a great deal of ca- 
pacity to learn mechanics, but, since he has had no opportunity to 
develop it in terms of ability, the test result for him is meaningless. 
Therefore, in Sam’s case, the test is not valid as a measure of this 
type of mechanical aptitude. 

Frank, on the other hand, has not learned much about tools or
mechanical devices in spite of his many opportunities. We would be
forced to conclude, therefore, that he possessed little capacity
(aptitude) for this type of learning.

A differentiation should be made between aptitude tests designed 
to predict probable success in single occupational fields and those 
made up of batteries of relatively uncorrelated tests intended to be
valid for a variety of occupations and standardized on the same
populations. In the former category are tests measuring mechanical,
motor, clerical, music, and art ability, and for predicting success in 
training for professional fields of medicine, law, engineering, nursing, 
and teaching. The second category includes aptitude factors that may 
be involved in a large number of occupations and subjects. Included 
among these factors are verbal reasoning, space relations, numerical 
operations, language usage, word fluency, eye-hand coordination, 
finger and/or manual dexterity. 

Any multifactor aptitude test will result in the development of a
“profile” for each student, thus facilitating both counseling and
instruction.

One of the newer tests of this type is the Flanagan Aptitude Classi- 
fication Test (FACT), which provides for measuring ability in four- 
teen areas referred to by the manual as “facts.” The manual lists and 
describes the various “facts” as follows: 


Fact
Number  Name of Test    Description

1       Inspection      This test measures ability to spot flaws or
                        imperfections in a series of articles quickly and
                        accurately. The test was designed to measure the
                        type of ability required in inspecting finished or
                        semifinished manufactured items.

2       Coding          This test measures speed and accuracy of coding
                        typical office information. A high score can be
                        obtained either by learning the codes quickly or
                        by speed in performing a simple clerical task.

3       Memory          This test measures ability to remember the codes
                        learned in test 2.

4       Precision       This test measures speed and accuracy in making
                        very small circular finger movements with one hand
                        and with both hands working together. The test
                        samples ability to do precision work with small
                        objects.

5       Assembly        This test measures ability to “see” how an object
                        would look when put together according to
                        instructions, without having an actual model to
                        work with. The test samples ability to visualize
                        the appearance of an object from a number of
                        separate parts.

6       Scales          This test measures speed and accuracy in reading
                        scales, graphs, and charts. The test samples
                        scale-reading of the type required in engineering
                        and similar technical occupations.

7       Coordination    This test measures ability to coordinate hand and
                        arm movements. It involves the ability to control
                        movements in a smooth and accurate manner when
                        these movements must be continually guided and
                        readjusted in accordance with observations of
                        their results.

8       Judgment and    This test measures ability to read with
        Comprehension   understanding, to reason logically, and to use
                        good judgment in practical situations.

9       Arithmetic      This test measures skill in working with numbers:
                        adding, subtracting, multiplying, and dividing.

10      Patterns        This test measures ability to reproduce simple
                        pattern outlines in a precise and accurate way.
                        Part of the test requires the ability to sketch a
                        pattern as it would look if it were turned over.

11      Components      This test measures ability to identify important
                        component parts. The samples used are line
                        drawings and blueprint sketches. It is believed
                        this performance should be representative of
                        ability to identify components in other types of
                        complex situations.

12      Tables          This test measures performance in reading two
                        types of tables. The first consists entirely of
                        numbers; the second contains only words and
                        letters of the alphabet.

13      Mechanics       This test measures understanding of mechanical
                        principles and ability to analyze mechanical
                        movements.

14      Expression      This test measures feeling for and knowledge of
                        correct English. The test samples certain
                        communication tasks involved in getting ideas
                        across in writing and talking.¹⁵


15 Examiner's manual, Flanagan Aptitude Classification Tests (Chicago: Science
Research Associates, 1953), p. 5.


The manual points out that “each of the fourteen tests in the
FACT series is printed in a separate booklet. These separate booklets
permit flexibility in administrative use. The tests can be given for
only two or three job elements, or as a complete battery.”¹⁶

The examinees mark their answers in the test booklets. There are 
no separate answer sheets, but the tests are of the self-scoring type. 


[Figure: Example from FACT 1]


16 Ibid., p. 4. 


[Figure: Example from FACT 6 (chart of TEMPERATURE against NUMBER OF
MINUTES, with numbered answer columns)]


[Figure: Example from FACT 10. PART I: Make exact copies of the above
patterns in the spaces below.]


Eleven of the tests (all except 4, 7, and 10) use multiple-choice items 
and all are of the paper-and-pencil type. A few examples will serve 
to illustrate the nature of the items included in the various FACTS. 

The tests are all of the timed variety except “Judgment and Com- 
prehension” and “Expression.” 


The manual sets forth the purpose or function of the battery as a 
“standard classification system for describing those aptitudes that 
are important for successful performance of particular occupational 
tasks. The tests were planned as an integrated battery which would 
yield a series of composite occupational scores, thus providing a 
broad basis for predicting success in various occupational fields. The 
tests have been designed for two different types of uses: 


1. They can be used for vocational counseling as an aid to prediction
   of job success on the basis of aptitudes, and as a guide for
   planning a suitable program of school courses.
2. They can be used for the selection and placement of employees.”¹⁷


Each test score is interpreted in terms of “stanines,” a standard 
score on a nine-point scale, the meanings of which are shown in
Figure 19. 


[Figure 19: nine-point stanine scale. Only the labels at the low end
are legible: "a little below average" (3), "below average" (2), and
"very low" (1).]


Fig. 19. Meaning of stanine scores
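

The nine-point scale can be illustrated by converting a percentile rank to a stanine. The sketch below assumes the conventional stanine distribution of 4, 7, 12, 17, 20, 17, 12, 7, and 4 per cent of cases in stanines 1 through 9; whether the FACT norms follow exactly these percentages is an assumption of the illustration, as is the treatment of a percentile rank that falls exactly on a boundary.

```python
import bisect

# Cumulative percentages at the upper edge of stanines 1 through 8 under
# the conventional 4-7-12-17-20-17-12-7-4 distribution (assumed here).
CUMULATIVE_BOUNDS = [4, 11, 23, 40, 60, 77, 89, 96]


def stanine_from_percentile(percentile_rank):
    """Convert a percentile rank (0 to 100) to a stanine (1 to 9).

    A rank falling exactly on a boundary is assigned to the higher
    stanine; that convention is a choice made for this sketch.
    """
    if not 0 <= percentile_rank <= 100:
        raise ValueError("percentile rank must lie between 0 and 100")
    return bisect.bisect_right(CUMULATIVE_BOUNDS, percentile_rank) + 1


for pr in (2, 20, 50, 97):
    print(f"Percentile rank {pr:>2} -> stanine {stanine_from_percentile(pr)}")
```

Because each stanine covers a band of percentile ranks, small differences between scores are not exaggerated.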


The counselor's booklet which accompanies the FACT battery lists
"Recommended Tests for Thirty Occupations": accountant, artist, 
biological scientist, businessman, chemist, clerk (office), dentist, 
draftsman, electrician, engineer, farmer, humanities professor, lawyer, 
machinist, mathematician, mechanic, nurse, physician, physicist, pilot 
(airplane), plumber, printer, psychologist, sales person, secretary, 
social scientist, social worker, structural worker, teacher, writer, plus 
college aptitude (general). Thus, the counselor or teacher may select 
the tests most appropriate for measuring the aptitudes related to 
specific occupations, making it unnecessary to administer the entire 
battery to every student. 

The FACT battery is accompanied by several very helpful booklets: a
manual which provides step by step instructions for administering,
scoring, and interpreting the fourteen tests; a counselor’s booklet, to
acquaint the counselor with the uses of the FACT’s, along with job
descriptions and recommended tests for thirty occupations; a student's
booklet, designed to help the student record and interpret his own FACT
scores, and including the thirty job descriptions given in the
counselor's booklet along with suggestions for utilizing the FACT
scores as a basis for vocational planning; a personnel director's
booklet; a technical supplement, including a summary of the statistical
qualities of the battery and studies of the standardization,
reliability, validity, and intercorrelations of the tests; and aptitude
classification sheets, providing a means for computing and evaluating
an individual’s job-element aptitudes and occupational aptitudes.

17 Ibid., p. 26.

Other frequently used multiscore aptitude tests are the Differential 
Aptitude Tests (DAT), published by the Psychological Corporation, 
New York, which measure seven specific abilities; the General Apti- 
tude Test Battery of the United States Employment Service 
(GATB), published by United States Government Printing Office, 
Washington, D.C., which consists of twelve tests measuring nine 
aptitudes; and the Yale Educational Aptitude Test Battery, pub- 
lished by Educational Records Bureau, New York City, which meas- 
ures seven abilities. 


Illustrative Aptitude Test Items 
Clerical !* 


Instructions: On the inside pages there are two tests. One of the 
tests consists of pairs of names and the other of pairs 
of numbers. If the two names or the two numbers of 
a pair are exactly the same make a check mark (✓)
on the line between them; if they are different, make
no mark on that line. When the examiner says “Stop!”
draw a line under the last pair at which you have
looked.

37. 283019283745 — 283019283745 
38. 73927102 — 73927102 
39. 91029354829 — 81029354829 
40. 38291728 — 38291728 


18 Minnesota Clerical Test, The Psychological Corporation, New York 18, New 
York. 


87. 6241526 — 6241526
88. 1426389012 — 1426389102
89. 825 — 825
90. 67253917287 — 67253917287

101. Crane Ltd. — Crane Co.
102. Isaac F. Marcosson — Isaac F. Marcoson
103. Stromberg Carlson — Stromberg Carlsen
104. W. A. Evans — W. A. Evans

151. H. J. Heinz — H. J. Hienz
152. National City Co. — National City Co.
153. Dorothy Gray — Dorothy Gray
154. Reinhard Brothers — Reinhart Brothers
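
The checking rule stated in the instructions for these items can be expressed as a short Python sketch. It is offered only as an illustration of how an answer key for such items might be prepared. The function name `pairs_match` and the dictionary layout are assumptions made for the example and are not part of the published test materials.

```python
def pairs_match(left: str, right: str) -> bool:
    """Return True only when the two entries are exactly the same."""
    return left == right


# A few pairs copied from the illustrative items.
items = {
    37: ("283019283745", "283019283745"),
    39: ("91029354829", "81029354829"),
    102: ("Isaac F. Marcosson", "Isaac F. Marcoson"),
    104: ("W. A. Evans", "W. A. Evans"),
}

# Item numbers that should carry a check mark on the answer key.
answer_key = [number for number, (left, right) in items.items()
              if pairs_match(left, right)]
print(answer_key)
```

Only the exactly identical pairs enter the key; a pair that differs by a single digit or letter is left unmarked, which is the discrimination the test is designed to observe.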


Mechanical¹⁹

5. Which wheel moves faster?
   A    B    Equal
   [Illustration not reproduced; labels: DRIVER, A, B]

6. Which man will have to pull harder in order to move the load?
   A    B    Equal
   [Illustration not reproduced; labels: A, B]


MEASURING ACHIEVEMENT 
The achievement test is an instrument that measures the extent to 
which a person has achieved, or accomplished, something—acquired 
certain information or mastered certain skills in a given area, usually 


19 Mechanical Comprehension Test (AA), Bennett & Fry, The Psychological Cor- 
poration, New York 18, New York. 


as a result of specific instruction. Thus, tests designed to determine 
the extent of a person’s accomplishments or level of achievement are 
the oldest and most common of all measures. Efforts to determine 
who is the “best” have appeared and still appear in an almost end- 
less variety of human affairs and/or activities—the jousting tourna- 
ments of the knights; the Olympic games, ancient and modern; log- 
rolling contests; boxing matches; chess matches; baseball, football, 
basketball, and other competitive games; track meets; school courses 
which end in the assigning of a grade; political races; auto races; 
business negotiations; and speech, oratorical, and musical contests, to
mention only a few examples in a variety of areas. These are all per- 
formance tests and in each case the competition is between two or 
more persons or contestants competing directly against each other. 
It is possible, however, in many instances to compete “against the 
clock” or some other fixed standards or criteria on the basis of which, 
then, the caliber of the performance is judged. Time trials in all 
forms of racing, standards or criteria of “excellence” in speech, music, 
and oratorical contests, as well as objectives or goals set in courses 
or classes, illustrate this indirect form of competition or testing. 

In almost every activity of life we are in a test situation, the eval- 
uators being our friends, colleagues, and ourselves, and the criteria 
being personal satisfaction, job success, and human relationships, 
among others. These illustrations are cited to show that testing—and 
particularly achievement testing—is not new nor is it unique to the 
school.

Achievement testing as the term is used today, however, refers to 
the use of either paper-and-pencil or performance-type instruments 
to measure achievement or accomplishment in a rather narrow band 
of human skill or knowledge, usually in school-related areas. Thus, 
the teacher may measure the extent of Mike’s achievement in mathe- 
matics, history, English grammar, chemistry, Spanish, vocal or 
instrumental music, or a host of other subject matter or skill areas by 
the administration of appropriate tests. 

Since the informal or teacher-made test has been discussed thor- 
oughly in earlier chapters of this book, we shall consider here only 
the standardized achievement test which may take the form of either 
a survey or a diagnostic instrument. The survey test measures 
achievement in broad areas with emphasis upon the amount, or level, 


of knowledge or skill achieved. The diagnostic test is designed to 
determine specific areas of weakness or deficiency in more restricted 
areas. Thus, the survey test may be said to emphasize strengths and 
the diagnostic test weaknesses. Diagnostic tests are used most in the 
elementary school to discover weakness in the basic skills (reading,
arithmetic, and language). However, they do have some application
in the secondary school, too, especially in connection with
special classes for the slow learners. 

Survey tests may also be classified as individual tests for specific 
subjects, such as for Latin, business education, home economics, 
music, and others, and as test batteries. The test battery is a group 
of tests standardized on the same population so that the results on 
the several tests are comparable. Test batteries have been in com- 
mon use in the elementary school for some time, but their use at the 
secondary level is somewhat limited. At the primary level the bat- 
teries are primarily concerned with measuring the level of skill 
achieved by pupils in the traditional 3 R’s—reading, writing, and 
arithmetic—the basic skills, so-called. It is relatively simple to devise 
tests to measure progress in these areas since the content and desired 
outcomes of instruction are fairly well fixed throughout the country 
and from grade level to grade level. Thus, the pupil’s strengths and 
weaknesses in each area, as well as his over-all progress, can readily 
be charted. The tests in a typical elementary achievement battery 
(usually bound in a single test booklet) include such areas as arith- 
metic (computation and problems), language usage skills, silent 
reading, science, social studies, and spelling. 

Achievement batteries at the secondary level are scarce, especially 
those in bound booklets. The batteries that do exist are more properly
referred to as test series, since each test is printed in a separate
booklet and measures the content and objectives found (by the test 
makers) to be most typical of the various courses all over the coun- 
try as determined by studies of textbooks, courses of study, and pro- 
fessional literature. Batteries, or series, of secondary tests usually 
cover such basic fields as English, foreign languages, mathematics,
the sciences, and the social studies. There are, of course, many tests 
available for measuring achievement in a wide variety of secondary 
school subjects. Many of the newer and better achievement tests at 
both the elementary and secondary levels now combine to some ex- 


tent the functions of the survey and the diagnostic instruments by 
providing total scores showing over-all levels of achievement in broad 
subject areas (French, algebra, physics), plus subscores showing the 
student’s strengths and weaknesses in the various components com- 
prising the total test. 

The teacher or administrator who is selecting a test or test battery
for use in a secondary school should study each proposed test very
carefully to make sure that it has adequate content (curricular)
validity. This is extremely important, since high school courses vary
considerably with respect to content, objectives, and instructional
emphases, while the tests reflect average, or typical, courses. If a
particular school, after careful consideration, is satisfied
with the content, objectives, and emphases of a course, then the test 
should be selected in terms of the course, rather than selecting a test 
and then attempting to fit the course to it. 

The selection of a test or test battery should proceed according to 
these steps: 

1. Determine what you want to test for: objectives, content, emphases.

2. Make a preliminary survey of tests available in the subject area 
in question by consulting catalogs of test publishers, textbooks in 
tests and measurements, reference to Hildreth's Bibliography of 
Mental Tests and Rating Scales, Buros’ Mental Measurements 
Yearbook, and/or field representatives of the various publishers, 
as well as state departments of public instruction, many of whom 
have lists of tests available in different fields. 

3. Select the tests that appear to have the greatest promise for your 
particular purpose(s) for more intensive study. 

4. Study these tests thoroughly by— 

a. Reading the reviews of each test in Buros’ Mental Measure- 
ments Yearbook. 

b. Discussing them with publishers' field representatives.

c. Consulting authorities—college or university professors in psy- 
chology or education. 

d. Obtaining specimen sets of each test from the publishers and
comparing them, item by item, with the content, objectives, and
emphases of the course. This is the most important step in
the entire selection process.

e. Evaluating or appraising each test in terms of the criteria of a
good test (see Chapter 5).


f. Selecting the test that rates highest on all criteria, keeping in
mind that content validity should be the principal factor in
selection.

g. After administering the test, reappraising it in terms of the 
results. 
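
The closing steps of this procedure amount to rating each candidate test on several criteria while giving content validity the greatest weight. The Python sketch below shows one way such a comparison might be tabulated. The criterion names, the weights, the 1-to-5 ratings, and the test names are all hypothetical; the text itself prescribes no numerical scheme.

```python
# Hypothetical weights: content validity counts most, as the steps advise.
WEIGHTS = {"content_validity": 0.5, "reliability": 0.2,
           "ease_of_scoring": 0.15, "cost": 0.15}


def weighted_rating(ratings: dict[str, float]) -> float:
    """Combine 1-to-5 ratings on each criterion into one weighted figure."""
    return sum(WEIGHTS[name] * ratings[name] for name in WEIGHTS)


# Invented ratings for two invented tests.
candidates = {
    "Test A": {"content_validity": 5, "reliability": 3,
               "ease_of_scoring": 3, "cost": 3},
    "Test B": {"content_validity": 2, "reliability": 5,
               "ease_of_scoring": 5, "cost": 5},
}

best = max(candidates, key=lambda name: weighted_rating(candidates[name]))
print(best)
```

With these invented figures the test that matches the course content is preferred even though the other test rates higher on every remaining criterion. A school would still reappraise the choice after administering the test, as the last step requires.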


Reference was made earlier to Hildreth’s Bibliography of Mental
Tests and Rating Scales and Buros’ Mental Measurements Yearbook.
Both these volumes are extremely useful in the selection of
tests for specific purposes. Hildreth lists some five thousand tests in 
all areas, classified according to name, type, and publisher. The 
teacher who would like to find a test to measure achievement in 
English, science, or social studies need but turn to the appropriate 
section in the book and find listed alphabetically all the tests pub- 
lished in that area. He can then go to the latest edition of The Mental 
Measurements Yearbook, look up the name of the particular 
test in which he is interested, and find not only a report of the sta- 
tistical and factual data about the test—forms available, suitability 
for various ages or grades, validity, reliability, cost, and so on—but 
also critical reviews of the test by a number of authorities who state 
their views of the test, giving its strong as well as its weak points. 


It would be impossible here to attempt a listing of all the various
achievement tests available for use in the secondary school. The list
which follows merely presents a few of the typical achievement
batteries or series and some of the individual tests available in specific
subject fields. The publishers listed are also not all-inclusive, but are
those who perhaps are responsible for publishing most of the tests in
common use in the schools.


Representative Achievement Batteries or Series, Publisher,²⁰ and Content


California Achievement Tests, Advanced Battery, Grades 9-14
    Publisher: 1
    Content: Reading vocabulary, reading comprehension, arithmetic
    reasoning, arithmetic fundamentals, mechanics of English and
    grammar, spelling

Cooperative General Achievement Tests, Grades 12-13
    Publisher: 7
    Content: Social studies, natural sciences, mathematics

20 See list of publishers, p. 319. 


Essential High School Content Battery, Grades 10-13
Myers-Ruch High School Progress Test, Grades 9-12
    Content: Mathematics, science, social studies, English

Evaluation and Adjustment Series for Secondary Schools, Grades 9-12
    Content: Specific tests covering areas of mathematics, science,
    English, literature, reading, social studies, health, study skills,
    listening comprehension, psychology

Iowa High School Content Examination, Grades 11-13
    Content: English grammar and literature, mathematics, science,
    social studies

Achievement Examinations for Secondary Schools, Grades 9-12
    Content: English, social science, science, mathematics, business
    education, foreign languages

Iowa Tests of Educational Development, Grades 9-12, College
    Content: Social concepts, natural science, correctness of
    expression, quantitative thinking, interpretation of reading
    materials in the social sciences and natural sciences,
    interpretation of literary materials, general vocabulary, use of
    sources of information


Representative Individual (Subject) Achievement Tests


ENGLISH

Nelson High School English Test, Grades 7-12
    Content: Word usage, sentence structure, functional grammar,
    punctuation


Barrett, Ryan, Schrammel English Test: New Edition, Grades 9-12
    Content: Functional grammar, punctuation, vocabulary,
    pronunciation, the sentence (parts of speech, parts of a sentence,
    sentence elements)


MATHEMATICS

Davis Test of Functional Competence in Mathematics (M), Grades 9-13
    Content: Consumer problems, graphs and tables, symbolism,
    equations, ratio, tolerance, etc.

Lee-Clark Arithmetic Fundamentals Survey Test, Grades 9-12
    Content: Twenty basic arithmetic processes

Lankton First Year Algebra Test, Grades 9-13
    Content: Algebra vocabulary, symbols, equations, formulas,
    algebraic fractions, radicals, ratio, graphs, trigonometry
    functions, and problems

American Council Solid Geometry Test, Grades 11-13
    Content: Typical geometric functions and problems

American Council Trigonometry Test (Revised), Grades 11-13
    Content: Typical trigonometric functions and problems


SCIENCE

Midwest High School Achievement Examinations: General Science,
Biology, Physics, Chemistry
    Content: Content keyed to typical high school subject-matter
    content

Anderson Chemistry Test, Grades 11-13
    Content: Facts and concepts, functional principles, scientific
    method, use of basic skills in chemistry

FOREIGN LANGUAGE

Achievement Examinations for Secondary Schools, Grades 9-12:
French I and II, German I and II, Latin I and II, Spanish I and II
    Content: Tests keyed to usual high school content

Columbia Research Bureau French Test and Spanish Test, Grades 9-13
    Content: Tests keyed to usual high school content


SOCIAL SCIENCES

California Tests in Social and Related Sciences, Advanced Battery,
Grades 9-12, Parts I and II
    Content: Creating a new nation; nationalism, sectionalism and
    conflict; emergence of modern America; U.S. in transition
    (since 1918)

Crary American History Test, Grades 9-13
    Content: Historical facts, historical processes, ability to
    interpret historical data, maps, etc.

Cummings World History Test, Grades 9-13
    Content: Knowledge and understanding of great movements and social
    trends that have taken place in development of civilization


MISCELLANEOUS

Watson-Glaser Critical Thinking Appraisal (M), Grades 9-adult
    Content: Measures five aspects of ability to think critically

Diagnostic Tests of Achievement in Music
    Content: Mastery of theory and skills involved in reading music
    and as a background for a sound music education


DETERMINING INTERESTS 


Interests are expressions of freely chosen activities usually asso- 
ciated with basic human needs or drives, and resulting in spontane- 
ous pleasure or satisfaction. Since interests are closely tied in with 
needs, they vary with needs; and since needs vary with age, sex, 
environment, and physical and mental characteristics, it is clear that 
interests are influenced by a complex pattern of factors. This com- 
plex interrelationship of personal and environmental factors makes 
it very difficult to “measure” interests on any standard quantitative 
scale, since such a scale would, in all probability, not apply in the 
same way to each person. 

Therefore, about the best that can be done in the way of deter- 
mining interests at the present time is to use a number of different 
techniques and tools, none of which can properly be termed a test. 
In any case, the best the teacher can do is to obtain evidence of the 
presence of interests by— 


1. Observing what the student does during his free time, both in and
out of school.

2. Noting the elective courses the student takes.

3. Obtaining statements, oral or written, of interests from the student.

4. Observing, or obtaining other evidence of, the type of reading the
student does.

5. Having the student respond to interest inventories of various
kinds, vocational and/or personal.

Since we are concerned in this chapter with the more formal, 
structured measuring devices, we shall discuss only the last of the 
above techniques for appraising or canvassing interests, the inven- 
tory. On the secondary level most interest inventories are concerned 
with specific occupations or vocational fields. 

The Occupational Interest Inventory (California Test Bureau), 
for example, “measures” interests in six major occupational areas: 
Personal-Social (domestic service, social service, teaching) ; Natural 
(farming and ranching; fish, game, and domestic fowl) ; Mechanical 
(maintenance, machine operation, designing); Business (clerical,
selling, buying, management and control); The Arts (art crafts,
musical performance, painting, and drawing); The Sciences (ap- 
plied chemistry, biological research, assistant in scientific work). 
This instrument uses the paired-comparisons technique of presenting 


pairs of descriptions of unrelated activities. The student is instructed 
that he must select one of each pair of activities even though he may 
have no great interest in either. Examples of these items follow: 


a. Raise chickens, ducks, or turkeys and sell them.

b. Arrange a display of watches, rings, and other jewelry in a store
window.

c. Keep receipts or other records in order.

d. Collect rocks, crystals, or other earth formations.

e. Bake bread, pies, cakes, or rolls.

f. Measure the depth of oceans and the flow of ocean currents.


In addition to determining a student’s preference for different 
occupational fields, this instrument also provides a means for iden- 
tifying three types of interests—verbal, manipulative, and computa- 
tional—and a measure of level of interest. According to the manual, 
“the student learns whether he enjoys occupations requiring simple 
routine and unskilled activities or occupations involving originality, 
inventiveness and professional skill.” 
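
The forced-choice tally that underlies a paired-comparisons inventory can be sketched in a few lines of Python. The example is a minimal illustration only: the grouping of the sample activities into pairs, the field assigned to each activity, and the function name `tally_fields` are assumptions made for the sketch and do not reproduce the publisher's scoring key.

```python
from collections import Counter

# Each pair offers two activities, each tagged with an occupational field.
# The pairings and field tags are assumed for the example.
pairs = [
    (("Raise chickens, ducks, or turkeys and sell them", "Natural"),
     ("Arrange a display of watches in a store window", "Business")),
    (("Keep receipts or other records in order", "Business"),
     ("Collect rocks, crystals, or other earth formations", "The Sciences")),
    (("Bake bread, pies, cakes, or rolls", "Personal-Social"),
     ("Measure the depth of oceans", "The Sciences")),
]

# 0 means the first activity of the pair was chosen, 1 the second.
# One choice is required for every pair, even if neither appeals.
choices = [0, 1, 1]


def tally_fields(pairs, choices):
    """Count how often an activity from each field was preferred."""
    if len(pairs) != len(choices):
        raise ValueError("every pair must have exactly one choice")
    return Counter(pair[pick][1] for pair, pick in zip(pairs, choices))


profile = tally_fields(pairs, choices)
print(profile.most_common())
```

Because a choice is forced in every pair, the tallies show relative preference among fields rather than the absolute strength of any one interest.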

Another inventory of a somewhat different type is the series of 
Kuder Preference Records (Science Research Associates), which 
measure interests in broad classes of occupations (Kuder Preference 
Record—Vocational, Form C); in specific occupations (Kuder Preference
Record—Occupational, Form D); and in different types of
personal and social activities (Kuder Preference Record—Personal,
Form A). 

The vocational record measures ten broad areas of vocational and 
educational interest: outdoor, mechanical, computational, scientific, 
persuasive, artistic, literary, musical, social service, and clerical. The 
test contains 168 groups of three activities each. The students mark 
the activity they prefer most and the one they like least in each 
group. 
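
The most-and-least marking described here lends itself to a simple tally, sketched below in Python. This is an illustration under stated assumptions and not the published scoring procedure: the interest area attached to each activity, the rule of adding one point for MOST and subtracting one for LEAST, and the function name `score_triads` are all hypothetical.

```python
from collections import defaultdict

# Each group lists three activities, each tagged with an interest area.
# The area tags and the +1 / -1 rule are assumed for illustration.
groups = [
    [("Tinker with a broken sewing machine", "mechanical"),
     ("Play a piano", "musical"),
     ("Sketch an interesting scene", "artistic")],
    [("Sell vegetables", "persuasive"),
     ("Be an organist", "musical"),
     ("Raise vegetables", "outdoor")],
]

# For each group: (index of activity liked MOST, index liked LEAST).
responses = [(1, 0), (1, 0)]


def score_triads(groups, responses):
    """Add 1 to the area marked MOST and subtract 1 from the area marked LEAST."""
    scores = defaultdict(int)
    for group, (most, least) in zip(groups, responses):
        if most == least:
            raise ValueError("MOST and LEAST must be different activities")
        scores[group[most][1]] += 1
        scores[group[least][1]] -= 1
    return dict(scores)


scores = score_triads(groups, responses)
print(scores)
```

The activity left unmarked in each group contributes nothing, so an area builds a high score only when it is repeatedly preferred over its companions.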

The occupational record measures the student's interest in specific 
occupations, rather than in broad areas. At present it consists of 
one hundred items covering twelve occupations, but this number is 
being added to constantly. 

The personal record measures preference for five different types of 
personal and social activities: working with ideas, being active in 
groups, avoiding conflicts, directing others, and being in familiar and 
stable situations. 


Examples of items in each record are presented below: 


Vocational
Student indicates which of the three activities in each group he likes
MOST and LEAST.

Tinker with a broken sewing machine
Play a piano
Sketch an interesting scene

Sell vegetables
Be an organist
Raise vegetables


Occupational

A. Have a good assortment of art supplies
B. Have a garden
C. Have a workbench and tools

U. Write advertising
V. Be in charge of a public library
W. Publish a newspaper


Personal

Go to see a fire
Go to see an accident in which people have been hurt
Go to see a famous person riding along the street

Track down criminals
Be in charge of a prison
Conduct studies to find out how criminals think


The Strong Vocational Interest Blank (The Psychological Cor- 
poration) is available in two forms, Form M (for men) and Form W 
(for women). Form M measures interests in forty-seven occupations,
for six groups of occupations, and for four special variables (Interest
maturity, Occupational level, Specialization level, and Masculinity-
Femininity). Form W measures interest in twenty-eight occupations
plus Masculinity-Femininity. Although hand-scoring blanks are 
available for both forms, hand scoring is a very time-consuming 
operation, especially when the test is scored for several occupations. 
If it is to be scored for all the occupations (which usually isn’t nec- 
essary), or even any substantial number of occupations in addition 
to the other factors and variables, it is highly desirable to arrange 


for machine scoring. Although the cost of machine scoring is rather 
high, the results in certain instances are well worth the money spent, 
since a very complete analysis of the interest patterns of the indi- 
vidual is obtained, which is very helpful in difficult vocational coun- 
seling cases. 

Examples of items from this instrument follow:²¹


Part I. Occupations. Indicate after each occupation listed below whether
you would like that kind of work or not. Disregard considerations of 
salary, social standing, future advancement, etc. Consider only whether 
or not you would like to do what is involved in the occupation. You are 
not asked if you would take up the occupation permanently, but merely 
whether or not you would enjoy that kind of work, regardless of any 
necessary skills, abilities, or training which you may or may not possess. 
Draw a circle around L if you like that kind of work 
Draw a circle around I if you are indifferent to that kind of work 
Draw a circle around D if you dislike that kind of work 
Work rapidly. Your first impressions are desired here. Answer all the 
items. Many of the seemingly trivial and irrelevant items are very useful 
in diagnosing your real attitude. 


 1 Actor (not movie) .... L I D    46 Jeweler ................. L I D
 2 Advertiser ........... L I D    47 Judge ................... L I D
 3 Architect ............ L I D    48 Labor Arbitrator ........ L I D
 4 Army Officer ......... L I D    49 Laboratory Technician ... L I D
 5 Artist ............... L I D    50 Landscape Gardener ...... L I D

Part IV. Activities. Indicate your interests as in Part I.

186 Repairing a clock ............ L I D
187 Adjusting a carburetor ....... L I D
188 Repairing electrical wiring .. L I D
189 Cabinetmaking ................ L I D
190 Operating machinery .......... L I D


Part V. Peculiarities of People. Record your first impression.
Do not think of various possibilities or of exceptional cases.
“Let yourself go” and record the feeling that comes to mind as
you read the item.

234 Progressive people ........... L I D
235 Conservative people .......... L I D


21 Vocational Interest Blank for Men (Revised), Form M, Edward K. Strong, Jr., 
Stanford University Press, Stanford University, Stanford, California. 


Although almost every major test publisher has some type of in- 
terest test, the three discussed here are probably the most widely 
used at the secondary school level and are representative of all others. 

In measuring vocational interests, the technique employed in 
standardization is to catalog or describe the constellation of interests 
of persons who are already engaged in various occupations or voca- 
tional fields and then to compare their preferences of activities with 
those of the persons whose interests are being measured. If, for ex- 
ample, a student’s preferences of activities are similar to those of 
workers in outdoor occupations, he is described as having interests 
related to outdoor occupations, or he is said to have outdoor interests. 
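
The comparison described in this paragraph can be pictured with a small Python sketch. It assumes, purely for illustration, that each occupational group is summarized by its typical response to a handful of items and that similarity is the share of items on which the student gives the same response. The group profiles, the student's answers, and the function name `agreement` are invented; actual inventories rely on scoring keys built from large samples.

```python
def agreement(student: dict[str, str], group: dict[str, str]) -> float:
    """Share of common items on which the student answers as the group typically does."""
    shared = [item for item in student if item in group]
    if not shared:
        return 0.0
    same = sum(1 for item in shared if student[item] == group[item])
    return same / len(shared)


# Invented typical responses (L = like, I = indifferent, D = dislike).
occupation_profiles = {
    "outdoor": {"Landscape Gardener": "L", "Judge": "D",
                "Repairing a clock": "I"},
    "mechanical": {"Landscape Gardener": "I", "Judge": "D",
                   "Repairing a clock": "L"},
}

student = {"Landscape Gardener": "L", "Judge": "D", "Repairing a clock": "D"}

similarity = {name: agreement(student, profile)
              for name, profile in occupation_profiles.items()}
closest = max(similarity, key=similarity.get)
print(closest, similarity)
```

A high figure says only that the student's pattern of preferences resembles that of the group; it says nothing about how strongly any preference is held.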

One rather serious limitation of all interest test scores which the 
teacher must consider in interpreting results for any student is the 
fact that the score does not necessarily reflect intensity of interest. 
Two students may have identical interest patterns, but one’s inter- 
ests may be much more intense than the other’s. It is well to remem- 
ber in the over-all evaluation of students that measured interests and 
other interest indications serve as checks on each other. Both are 
needed. 


“MEASURING” PERSONALITY AND SOCIAL ADJUSTMENT 


As with interests, it is very difficult to “measure” the personal 
qualities of individuals. This difficulty stems from several sources, 
chief among which are— 


1. The complex nature of human behavior.

2. The variation in personal behavior characteristics from person to
person.

3. The fluctuation in behavior patterns from situation to situation for
the same person.

4. The fact that there is no one desirable pattern of behavior or
personal traits which fits all persons.


In addition to such fundamental problems as these, there are still 
other difficulties involved in measuring with the so-called paper-and- 
pencil type tests: 


1. There is a question as to whether the examinee is, or can be, honest
in responding to questions which may appear to him to damage
his self-concept.

2. The tests merely survey a sampling of behaviors or characteristics,
but do not succeed in measuring the seriousness of the various
problems. A single problem checked by one examinee may be the
cause of more difficulty for him than twenty problems checked by
someone else; yet the number of checks usually is taken as the
index of adjustment or maladjustment.

3. There is frequently a great discrepancy between what a person
says he does or would do on a test and what he actually does.


Personality tests, or adjustment inventories, may be classified
roughly into two categories:


1. The paper-and-pencil type (which may be used with any size
group).

2. The projective type (usually used in testing on an individual
basis).


Since the formal projective devices are not used much on the sec- 
ondary level except in severe clinical cases, we shall refer to them 
only briefly later in this chapter. 

Most paper-and-pencil type personality tests consist of lists of 
questions or statements to which the student responds in terms of his 
attitude, feeling, or behavior (actual or imagined). The following 
items from the California Test of Personality will serve to illustrate 
this point:²²

61. Is it hard for you to talk to classmates of the opposite sex? 
81. Do your eyes hurt often? 

117. Do you prefer to have parties at your own home? 

134. Do you feel that some people deserve to be hurt? 


The components of this test will also serve to illustrate the manner 
in which tests of this type tend to describe total personality or ad- 
justment in terms of various elements or factors: 


1. Personal adjustment
A. Self-reliance
B. Sense of personal worth
C. Sense of personal freedom
D. Feeling of belonging
E. Withdrawing tendencies
F. Nervous symptoms


22 California Test of Personality, California Test Bureau, Los Angeles. 


2. Social adjustment
A. Social standards
B. Social skills
C. Anti-social tendencies
D. Family relations
E. School relations
F. Community relations


A separate score is obtained for each component as well as for 
each of the two major areas and for the total test.
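
The three levels of scoring mentioned here (component, major area, total test) can be illustrated with a brief Python sketch. It assumes that an area score is simply the sum of its component scores and that the total is the sum of the two areas; the text does not state the arithmetic, and the numbers are invented.

```python
# Component names follow the outline of the test; the scores are invented.
component_scores = {
    "Personal adjustment": {
        "Self-reliance": 10, "Sense of personal worth": 12,
        "Sense of personal freedom": 11, "Feeling of belonging": 13,
        "Withdrawing tendencies": 9, "Nervous symptoms": 8,
    },
    "Social adjustment": {
        "Social standards": 14, "Social skills": 10,
        "Anti-social tendencies": 12, "Family relations": 11,
        "School relations": 9, "Community relations": 13,
    },
}

# One score for each major area (assumed here to be a plain sum),
# then one score for the total test.
area_scores = {area: sum(parts.values())
               for area, parts in component_scores.items()}
total_score = sum(area_scores.values())
print(area_scores, total_score)
```

Keeping the component scores alongside the totals preserves the detail a counselor needs when two students with the same total differ sharply in one component.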

In spite of a great deal of criticism of paper-and-pencil personality
tests, they do represent a tool which can be of value to the teacher 
and the counselor in an over-all evaluation of student progress in a 
very important area of his development—personal and social adjust- 
ment. As is true with any instrument or tool, it must be used intelli- 
gently, with full knowledge of its idiosyncrasies, limitations, and 
advantages if it is to yield its maximum benefits. The following list 
of cautions and suggestions is presented to aid the teacher-counselor
in getting the most out of the personality (adjustment) test: 


1. If it is used as a screening device given to large groups of students,
those with unsatisfactory scores should be followed up preferably
by an interview to learn the reasons for the low score. On the
other hand, it must be remembered that every student who attains
a satisfactory score is not necessarily well adjusted—he may have
“faked” the answers on the test in order to make himself “look
good.”

2. The students must want to take the test, understand the reasons
for taking it, and know what use is to be made of the results if
any confidence is to be placed in the results.

3. The examiner must establish the very best possible rapport with
the students to insure a high degree of cooperation on their part
and maximum honesty in responding.

4. The best results are achieved if the test is given individually as a
part of the counseling process when the student is willing and
anxious to be tested, in fact, has requested that a personality test
be given.

5. It is desirable to use the test as a basis for an interview with the
student, and to discuss every item which the student has marked
contrary to the expected response, even though his total score may
mark him as “satisfactory” or “well-adjusted.”

6. It should be used in conjunction with other personality appraisal
techniques (interview, anecdotal records, autobiographies, socio-
grams, and so on) and not accepted as the whole answer to person-
ality evaluation. It is not the only tool, but it is a tool; therefore,
it should be used but with common sense and understanding.
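
Cautions 1 and 5 together suggest a simple follow-up rule: interview any student whose total falls below a chosen cutoff, and also any student who marked an item contrary to the expected response, whatever his total. A minimal Python sketch of that rule follows. The cutoff, the student records, and the function name `needs_interview` are hypothetical and serve only to make the rule concrete.

```python
def needs_interview(total_score: int, cutoff: int,
                    unexpected_items: list[int]) -> bool:
    """Flag a student when the total is below the cutoff or when any
    item was marked contrary to the expected response."""
    return total_score < cutoff or len(unexpected_items) > 0


# Invented records: (total score, item numbers marked contrary to expectation).
students = {
    "Student 1": (95, []),
    "Student 2": (60, [61, 134]),
    "Student 3": (98, [117]),  # satisfactory total, yet one item to discuss
}
CUTOFF = 80  # hypothetical; a real cutoff would come from the test manual

follow_up = [name for name, (score, items) in students.items()
             if needs_interview(score, CUTOFF, items)]
print(follow_up)
```

A student who is not flagged is not thereby shown to be well adjusted, since the answers may have been “faked”; the rule only orders the counselor's follow-up and does not replace the other appraisal techniques named in item 6.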


ILLUSTRATIVE PERSONALITY AND ADJUSTMENT TEST ITEMS 


In the following items taken from the A-S Reaction Study,²³ the
student checks the answer which seems to be most typical of his 
usual reactions. 


1. In witnessing a game of football or baseball in a crowd, have you 
intentionally made remarks (witty, encouraging, disparaging, or 
otherwise) which were clearly audible to those around you? 


frequently 
occasionally 
never 


2. a) At a reception or tea do you seek to meet the important person
present?


usually
occasionally
never


b) Do you feel reluctant to meet him?


yes, usually
sometimes
no


The items below from the Bell Adjustment Inventory are answered
by drawing a circle around the “yes,” “no,” or “?”:²⁴


51 Yes No ? Do you feel that your friends have happier home
environments than you?

52 Yes No ? Do you often hesitate to speak out in a group lest
you say and do the wrong thing?

53 Yes No ? Do you have difficulty in getting rid of a cold?

54 Yes No ? Do ideas often run through your head so that you
cannot sleep?


23 A-S Reaction Study, Houghton Mifflin Co., Boston.
24 The Adjustment Inventory (Adult Form), Hugh M. Bell, Stanford University
Press, Stanford, California.


Another type of instrument in this category (appraisal of personal 
qualities and adjustment) is the problem checklist, best illustrated 
by the Mooney Problem Check List and the SRA Youth Inventory. 
Both of these are lists of the problems that young people worry 
about. The problems are classified in such categories as “My school,” 
“Boy meets girl,” “My home and family,” and so on. The purpose of 
these instruments is to help the teacher-counselor to identify the 
problem areas which are of concern to the students as a basis for 
group guidance or, in some cases, individual counseling. In general, 
the same cautions and suggestions apply to these checklists as apply 
to the personality tests mentioned previously. 

Reference was made in Chapter 11 to another type of person- 
ality appraisal device, the projective test. Although there are many 
projective techniques which can be used in studying students, 
we will refer here to only two—the Thematic Apperception Test 
(TAT) and the Rorschach Test. Both devices are fairly easy to 
administer, but the interpretation of the results is a highly tech- 
nical process requiring special training and a great deal of experience. 
As mentioned earlier, they are clinical tools and not designed for use 
by the teacher or even the counselor in most instances. 

The TAT consists of thirty-one picture cards providing two series 
of ten each for boys, girls, men, and women. The examinee is asked 
to tell a story about each picture—what has happened, what is hap- 
pening, and what will happen. On the basis of the stories the ex- 
aminer is able to determine some of the drives, emotions, conflicts, 
and sentiments of the examinee. Since there are no right or wrong 
answers, the examinee is not able to “fake” his responses; he can 
only project himself into the picture and respond in terms of his own 
experiences and feelings. 

The Rorschach technique consists of showing the examinee a series 
of ten inkblots. He is asked to tell what each blot reminds him of or 
looks like. The interpretation of the responses provides clues as to 
the person's personality traits, feelings, fears, and the like. It is now 
possible to obtain the inkblots reproduced on Kodaslides for projec- 
tion on a screen for large group testing. In this case, the subject 
(examinee) is provided with an answer sheet (multiple-choice) on 
which he records his answer by underlining the one response in the 
group for each inkblot which he believes best describes that particu- 




lar blot. Again, as with the TAT, the interpretation of the responses 
requires special training and experience. 

In summary it may be said that appraisal of personality develop- 
ment is essential in a complete evaluation program in today’s school. 
Granted that the present instruments for measuring growth in this 
important area are far from perfect, it does no good merely to com- 
plain. The instruments we have are the best so far developed. Let’s 
use them intelligently, but at the same time be constantly alert for 
new and improved methods and techniques. 


List of Test Publishers Referred to in This Chapter 


(1) California Test Bureau, 
    5916 Hollywood Boulevard, 
    Los Angeles 28, California 

    110 South Dickinson Street, 
    Madison 3, Wisconsin 

(2) World Book Company, 
    313 Park Hill Ave., 
    Yonkers 5, N. Y. 

(3) Psychological Corporation, 
    522 Fifth Avenue, 
    New York 36, New York 

(4) Educational Test Bureau, 
    Educational Publishers, Inc., 
    720 Wash. Ave., S.E., 
    Minneapolis, Minnesota 

    2106 Pierce Ave., 
    Nashville, Tennessee 

    3433 Walnut Street, 
    Philadelphia, Pennsylvania 

(5) Science Research Associates, 
    57 West Grand Avenue, 
    Chicago 10, Illinois 

(6) Houghton Mifflin Company, 
    2 Park Street, 
    Boston 7, Massachusetts 

    432 Fourth Avenue, 
    New York 16, New York 

    2500 Prairie Avenue, 
    Chicago 16, Illinois 

    715 Browder Street, 
    Dallas 1, Texas 

    777 California Avenue, 
    Palo Alto, California 

(7) Cooperative Test Division, 
    Educational Testing Service, 
    Princeton, New Jersey 

(8) Stanford University Press, 
    Stanford University, 
    Stanford, California 


CHAPTER 
15 


Interpretation of Test Scores 


WHAT DO THE scores on the test mean? How does my first-period 
class compare with my fifth-period class? How much have my stu- 
dents actually gained over the past four months? What does it mean 
when the test manual reads, “The standard deviation for the total 
test was 15”? Is it fair to use national norms to study the achieve- 
ment of my class? Do the statistics really prove that success in read- 
ing is related directly to the diet of students? Why do I have to 
study these scores? These and many other similar questions are 
raised by teachers as they seek to understand the meaning of the 
scores that their students achieve on tests. Unless teachers do under- 
stand how to interpret test results, the efforts made to study students 
are almost valueless. 

If the teacher will remember that two major purposes for using a 
test or any other evaluation technique are to discover ways and 
means of improving the instructional process and to guide individual 
progress, then the need for understanding the meaning of scores be- 
comes evident. Test results become information to be studied and 
interpreted for the benefit of the student and the teacher, not 
numbers to be recorded in a daily record book or on the cumulative 
record folder. Grade norms, means, standard deviations, correlations, 
and percentile ranks should become meaningful concepts because 
they are related to the teaching-learning process; they are not merely 
abstract concepts that belong to research workers or test specialists. 
The purpose of this chapter is to assist the teacher to understand 
the functional relationship of score interpretation to the teaching- 
learning process. 

What does a score mean? Johnny got a 65 on a science test, Mary 
received a 90 on an English test, and Susan scored a 47 on a history 
test. The scores of 65, 90, and 47 are actually meaningless unless a 
great deal more is known about the test in which these scores were 
received. The different meanings of a single test score can be easily 
illustrated. 


Test #1 Test #2 Test #3 
Student's score 75 75 75 
Average score for class 82 64 76 
High score for class 97 89 98 
Low score for class 61 57 62 


In the above illustration, if a student received a score of 75 on each 
of three tests, the score of 75 does not necessarily mean the same 
for each test but may vary from test to test. In the first test, the 
score of 75 indicates a position below the average of the class; in the 
second test the same numerical score would place the student well 
above the average score; and in the third test the score of 75 places 
the student in the average range. Even in this simple example it can 
be seen that, to understand test scores, it is at least necessary to 
know the average and the extreme scores. 

When a teacher gives a test he frequently has some idea of the 
scoring pattern that he will use. It is possible when giving a test to 
give one point for each item, or it is possible to weight each item so 
that one question gets ten points, another five, and still another one 
point. Frequently, teachers decide to establish their scoring pattern 
on the basis of 100 points. Each question is given a value and a 
“perfect” paper receives 100 and an arbitrary point, such as 70 or 75, 
is selected as the failing-passing mark. 

Once the teacher has decided on his scoring pattern and the papers 
have been scored, he simply takes the papers and arranges them in 
rank order from highest to lowest. 

The only difficulty in ranking papers is that there will often be 
two or more papers with the same score. In the following illustration 
three papers had scores of 62. To rank the papers it becomes a simple 




Raw Scores on      Arrangement of Papers       Rank Order of 
Algebra Papers     from Highest to Lowest      Papers 

     78                     87                     1 
     83                     84                     2 
     62                     83                     3 
     87                     79                     4 
     51                     78                     5 
     43                     75                     6 
     52                     73                     7 
     56                     71                     8 
     59                     67                     9 
     57                     66                    10 
     62                     65                    11 
     62                     64                    12 
     67                     63                    13 
     65                     62                    15 
     64                     62                    15 
     63                     62                    15 
     71                     61                    17 
     58                     59                    18 
     66                     58                    19 
     75                     57                    20 
     73                     56                    21 
     79                     52                    22 
     61                     51                    23.5 
     51                     51                    23.5 
     84                     43                    25 


matter of saying, “These three papers would normally rank 14, 15, 
and 16. Since they represent the same scores, we will find the average 
of the ranks and assign each paper this average, or a rank of 15.” 
The same process is involved in ranking all other scores as is shown 
in the above illustration. 
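
The ranking procedure just described can be sketched in a few lines of Python. This is only an illustrative sketch, not a required procedure; the function name `rank_with_ties` is invented for the example, and the scores are the twenty-five algebra scores from the illustration above.

```python
def rank_with_ties(scores):
    """Return (score, rank) pairs, highest score first.

    Papers with the same score share the average of the ranks
    they would otherwise occupy.
    """
    ordered = sorted(scores, reverse=True)
    ranked = []
    position = 0  # number of papers already ranked
    while position < len(ordered):
        score = ordered[position]
        tied = ordered.count(score)
        # The tied papers would occupy ranks position+1 through position+tied;
        # the average of those ranks is (first + last) / 2.
        average_rank = (2 * position + tied + 1) / 2
        ranked.extend([(score, average_rank)] * tied)
        position += tied
    return ranked


algebra_scores = [78, 83, 62, 87, 51, 43, 52, 56, 59, 57, 62, 62, 67,
                  65, 64, 63, 71, 58, 66, 75, 73, 79, 61, 51, 84]

for score, rank in rank_with_ties(algebra_scores):
    print(score, rank)
```

The three papers scored 62 each receive rank 15, and the two papers scored 51 each receive rank 23.5, as in the table.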

While ranking as described above gives the position of a score 
in a group of scores, it is frequently desirable to determine the per- 
centage of scores below or above a particular score. The process of 
finding the percentage of scores above or below a specific score is 
known as finding the percentile rank of a score. Many standardized 
test norms are based upon an analysis of the percentile ranks of the 
standardization group. The process of securing percentile ranks will 
be described later. 

Having ranked the scores, the teacher can now determine the range 
of the scores by subtracting the lowest from the highest score. In the 
example above, the highest score is 87 and the lowest score is 43, 
resulting in a range of 44 points. This information is helpful for com- 
parative purposes if other tests have been scored on the same basis. 


CENTRAL TENDENCY 


One of the most common means of describing a set of scores or 
other data is to find a single score that represents the scores for an 
entire class or group. The average is a number that represents a 
group of numbers but may not represent any single item of the 
group. In the following example, the average salary of the teachers in 
the Broadview Elementary School is $3,957.50, but not one member 
of the group received that salary. 


Salaries Paid to the Teachers of the Broadview 
Elementary School, 1955-1956 


                      $ 4,750 
                        4,500 
                        4,400 
                        4,200 
                        3,975 
Woodworth               3,975 
Hazelton                3,850 
Bishop                  3,425 
Hawkins                 3,400 
Jackson                 3,300 

Total                 $39,575 
Average (Mean)        $ 3,957.50 


The concept “average” is used in reporting many different patterns 
of data, but it must always be remembered that the average may not 
accurately describe the characteristics of any one individual. By a 
study of the average it is possible to tell how the individual differs 
generally from the group. It is a useful concept because it permits 
the teacher to gain a better picture of a total classroom group, al- 
though it has its limitations when studying individual behavior. 

It is necessary, however, to think of the concept of average in 
three different, although related, ways. The average used most com- 
monly is referred to by statisticians as the mean. The mean is the 
arithmetic total of all scores divided by the number of scores. 




Problem: What is the mean (average) of the following scores? 


Test Scores 

      82 
      79 
      69 
      54 
      53 
      51 
      49 
      42 
Number of Scores 8 ) 479 Total of Scores 
                    59.9 = Mean (Average) 


The difficulties of using the mean as an average become evident 
when an extreme high or low score is part of the data being con- 
sidered. To overcome this difficulty, it has been found desirable to 
make use of the concept of the “median.” The median is that point 
above which 50 per cent of the scores lie and below which 50 per 
cent of the scores lie. 


Problem: What is the mean and median of the following set of scores? 


                          Test Scores        Test Scores 

                              98                 98 
                              82                 82 
                              79                 79 
                              69                 69 
Mid-score (Median) --->       54                 54 
                              53                 53 
                              51                 51 
                              49                 49 
                              42                 42 
                                             9 ) 577 
                                                 64.1 = Mean (Average) 


The difference between the mean and median when extreme scores 
are present is evident in the above example. When the scores are 
added together and then divided by the total number of scores, a 
mean of 64.1 is obtained. In this case, the midscore, 54, is identical 
with the median. The one extreme high score of 98 raises the mean 
10 points over the median. 
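
The comparison above can be checked with Python's standard `statistics` module. The following is a minimal sketch using the nine scores from the problem; the variable names are chosen only for illustration.

```python
from statistics import mean, median

scores = [98, 82, 79, 69, 54, 53, 51, 49, 42]

# Mean: arithmetic total of the scores (577) divided by the number of scores (9).
print(round(mean(scores), 1))

# Median: the middle score once the nine scores are ranked.
print(median(scores))

# Dropping the single extreme score shows how strongly it pulled the mean,
# while the median moves very little.
without_extreme = [s for s in scores if s != 98]
print(round(mean(without_extreme), 1), median(without_extreme))
```

With the score of 98 included, the mean is about 64.1 and the median is 54; with it removed, the mean falls to about 59.9 while the median changes only to 53.5.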




In working with most sets of data, especially when large numbers 
of scores are involved, the difference between the mean and the 
median is not great. It is always important to study the original data 
to see if it will be influenced by extreme scores. Should there be ex- 
treme high or extreme low scores, the median should be reported. 
Extreme high and low scores in the same set of data tend to cancel 
one another out and the mean can be computed and used. The median 
in most cases is an adequate measure of central tendency and it can 
be located easily. If a question exists, a good procedure to follow is 
to report both the mean and the median. 

The mode is also identified as a measure of central tendency, but 
one that is used only rarely. It is the score or value which occurs 
most frequently in a set of scores or distribution. Since it is a very 
crude approximation of central tendency, either the mean or the 
median should be used in most cases. 
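
As a brief illustration, the mode of the twenty-five algebra scores used earlier in the chapter can be found with `statistics.multimode` (available in Python 3.8 and later), which returns every score tied for the highest frequency. This is a sketch for illustration only.

```python
from statistics import multimode

algebra_scores = [78, 83, 62, 87, 51, 43, 52, 56, 59, 57, 62, 62, 67,
                  65, 64, 63, 71, 58, 66, 75, 73, 79, 61, 51, 84]

# The score (or scores) occurring most frequently in the set.
modes = multimode(algebra_scores)
print(modes)
```

Here the mode is 62, which occurs three times. It lies below both the mean and the median of the set, which suggests why the mode is only a crude indication of central tendency.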


FREQUENCY DISTRIBUTION 


The data that have been used for illustrative purposes up to this 
point have been relatively simple. When handling the scores for a 
single classroom group, it is fairly simple to compute the mean by 
adding the scores and dividing by the total number of scores. When, 
however, it is necessary to deal with several hundred scores, the 
computation by the addition process becomes cumbersome and it is 
necessary to find other ways to handle the data more efficiently. This 
can be facilitated by preparing a frequency distribution. 

A frequency distribution is the arrangement of data (scores) in 
an organized system of classes or intervals. When arranging data 
in a frequency distribution, the main item to be considered is the 
size or limits of each interval or class. While there are no exacting 
rules that can be followed in determining these limits, it is common 
practice to: (1) find the extreme scores in the distribution, (2) sub- 
tract the lowest score from the highest score to determine the range, 
and (3) divide the range by seven and then by fifteen to determine 
the largest and the smallest class interval practical for the dis- 
tribution. Since the numbers five and ten are handled easily, many 
frequency distributions will be grouped in classes or intervals of five 
or ten points. 




Problem: Establish the class interval size for a frequency distribution 
in which the range of scores is from 74 to 23. 


High score   74 
Low score    23 
Range        51 

Dividing 51 by 7 results in a class interval size of 7; dividing by 15 
gives an interval size of 3. Therefore, a class interval size of 5 would 
probably be the easiest to handle and should be selected for use. 


Problem: Establish the class interval size for a frequency distribution 
in which the range of scores is from 126 to 32. 


High score   126 
Low score     32 
Range         94 

Dividing 94 by 7 results in a class interval size of 13; dividing by 15 
gives an interval size of 6. Therefore, a class interval size of 10 would 
probably be the easiest to handle. 
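
The rule of thumb used in the two problems above might be sketched as follows. The function names are invented for this example, and dropping the fractional part of each quotient is an assumption that happens to match the figures given in the problems.

```python
def class_interval_bounds(high, low):
    """Return (range, largest practical size, smallest practical size)."""
    score_range = high - low
    largest = score_range // 7    # about seven intervals
    smallest = score_range // 15  # about fifteen intervals
    return score_range, largest, smallest


def convenient_sizes(high, low, candidates=(5, 10)):
    """Pick the easily handled sizes (five or ten) that fall within the bounds."""
    _, largest, smallest = class_interval_bounds(high, low)
    return [size for size in candidates if smallest <= size <= largest]


print(class_interval_bounds(74, 23), convenient_sizes(74, 23))
print(class_interval_bounds(126, 32), convenient_sizes(126, 32))
```

For scores from 23 to 74 the bounds are 3 and 7, so an interval of 5 is chosen; for scores from 32 to 126 the bounds are 6 and 13, so an interval of 10 is chosen.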


It is common practice when constructing a frequency distribution 
to have the highest class interval at the top of the table and the 
lowest class interval at the bottom of the table. It is also common 
practice to place the lowest value in each class interval in the left- 
hand column, as shown in the illustrations below. 


Table 1 


Final test scores in spelling for the sixth grade at 
a sample school, 1957 


                       Number of Students 
Class Intervals        Making Scores in 
                       Each Interval 
90-99                         7 
80-89                         9 
70-79                        12 
60-69                        17 
50-59                        14 
40-49                         9 
30-39                         6 
20-29                         4 
10-19                         2 


In setting the size of the class limits one other convention is used 
that is most important. While scores for a spelling test can be dis- 
tributed conveniently as shown in Table 1, problems frequently 
do arise about the exact limits of each class interval. For the in- 
terpretation of test scores as presented in this book, it will be as- 
sumed that the exact limits for each interval are .5 below the lowest 
score and .5 above the highest score in the interval. This is illus- 
trated by the two examples below: 


Lower Limit     Lowest     Center Point     Highest     Upper Limit 
of Interval     Score      (Midpoint)       Score       of Interval 
                           of Interval 

   89.5           90          94.5            99           99.5 
   79.5           80          84.5            89           89.5 


The setting up of the frequency distribution intervals and the 
tallying of each score is a relatively easy matter but it must be done 
carefully if the results of any analysis of the data are to be accurate. 


Problem: Prepare a frequency distribution of the test scores received 
on a final examination in statistics given to 60 seniors. 


SCORES ON A TEST 
(Raw Scores) 


95 90 98 75 83 80 
60 87 99 80 85 88 
75 85 75 92 84 86 
82 84 63 88 85 84 
94 80 83 80 80 83 
78 76 87 82 90 80 
58 65 80 78 88 86 
80 78 84 75 68 81 
72 81 85 77 72 75 
93 55 67 90 70 78 


FREQUENCY DISTRIBUTION OF TEST SCORES 


Raw Scores        Number of Students 
95-99                     3 
90-94                     6 
85-89                    11 
80-84                    19 
75-79                    11 
70-74                     3 
65-69                     3 
60-64                     2 
55-59                     2 
                     N = 60 
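
The tallying can be sketched in Python as shown below. This is an illustrative sketch only: the function name is invented, the interval size of 5 is taken from the work guides that follow, and one entry in the last row of the raw scores is illegible in this copy and is assumed here to be 67 (the frequencies in the work guides require it to fall in the 65-69 interval).

```python
from collections import Counter

scores = [95, 90, 98, 75, 83, 80,
          60, 87, 99, 80, 85, 88,
          75, 85, 75, 92, 84, 86,
          82, 84, 63, 88, 85, 84,
          94, 80, 83, 80, 80, 83,
          78, 76, 87, 82, 90, 80,
          58, 65, 80, 78, 88, 86,
          80, 78, 84, 75, 68, 81,
          72, 81, 85, 77, 72, 75,
          93, 55, 67, 90, 70, 78]  # 67 is an assumed reading of an illegible entry


def frequency_distribution(scores, size=5):
    """Tally scores into intervals of the given size, highest interval first."""
    # Each score is assigned to the interval whose lowest value is a multiple of size.
    counts = Counter((s // size) * size for s in scores)
    return [(low, low + size - 1, counts.get(low, 0))
            for low in range(max(counts), min(counts) - 1, -size)]


for low, high, frequency in frequency_distribution(scores):
    print(f"{low}-{high}  {frequency}")
```

The tally gives frequencies of 3, 6, 11, 19, 11, 3, 3, 2, and 2 for the intervals 95-99 down through 55-59, a total of 60 students.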




Work Guide for Computing Mean from a Frequency Distribution 


Score          Number of         Deviation      Frequency Times 
Intervals      Students (f)      (d)            Deviation (fd) 

95-99               3               4               12 
90-94               6               3               18 
85-89              11               2               22 
80-84              19               1               19     +71 
75-79              11               0                0 
70-74               3              -1               -3 
65-69               3              -2               -6 
60-64               2              -3               -6 
55-59               2              -4               -8     -23 
               N = 60                              +48 = Σ (summation of) fd 

Formula for mean of grouped data: 

$$ M = A.M. + \left( \frac{\Sigma fd}{N} \right) i $$

M = Mean 
A.M. = Assumed mean 
Σ = the sum of 
fd = frequency times deviation 
N = total number of cases (students) 
i = size of the class interval 

Step #1—Select the class interval in which you think the mean occurs. 
In this example the interval 75-79 was selected. Any interval 
could have been selected. 

Step #2—Set the selected interval equal to zero in the deviation col- 
umn. Then note the number of class intervals that deviate 
plus or minus from the selected (A.M.) interval. Reading up 
toward the higher scores is plus, reading down toward the 
lower scores is minus. 

Step #3—Compute the “fd” column by multiplying the number of 
scores in each interval by the appropriate deviation, i.e., a 
frequency of 3 times a deviation of 4 gives a frequency-devi- 
ation value of 12. 

Step #4—Secure the sum of the frequency-deviation column as shown 
above. 

Step #5—Substitute in the formula and complete the arithmetic. 

A.M. = 77 (The assumed mean is the center point of the class interval.) 
Σfd = +48 
i = 5 
N = 60 

$$ \text{Mean} = 77 + \left( \frac{48}{60} \right) 5 = 77 + \frac{240}{60} = 77 + 4 = 81 $$
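
The work guide above can be sketched in Python as follows. The function name `grouped_mean` and the row-index argument are invented for this illustration; the frequencies are those of the 60 statistics scores.

```python
# (lowest score, highest score, frequency), highest interval first
table = [(95, 99, 3), (90, 94, 6), (85, 89, 11), (80, 84, 19), (75, 79, 11),
         (70, 74, 3), (65, 69, 3), (60, 64, 2), (55, 59, 2)]


def grouped_mean(table, assumed_row):
    """Mean of grouped data: M = A.M. + (sum of fd / N) * i."""
    low, high, _ = table[assumed_row]
    size = high - low + 1                 # i, the size of the class interval
    assumed_mean = (low + high) / 2       # center point of the selected interval
    n = sum(f for _, _, f in table)       # N, total number of cases
    # Rows above the assumed-mean row deviate plus, rows below deviate minus.
    sum_fd = sum(f * (assumed_row - row) for row, (_, _, f) in enumerate(table))
    return assumed_mean + (sum_fd / n) * size


print(grouped_mean(table, 4))  # assumed mean in the 75-79 interval
print(grouped_mean(table, 3))  # assumed mean in the 80-84 interval
```

Both calls give a mean of 81, which illustrates the remark in Step #1 that any interval could have been selected as the assumed mean.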


Work Guide for Computing the Median from a Frequency Distribution 
Score          Number of 
Intervals      Students 

95-99               3 
90-94               6 
85-89              11 
80-84              19      Interval containing median 
75-79              11 
70-74               3 
65-69               3      21 scores below 80 (79.5) 
60-64               2 
55-59               2 
               N = 60 


To find the median for the above data the procedure is as follows: 


Step #1—Divide the total number of cases by 2. In the example, this 
would be 60 ÷ 2 = 30. 

Step #2—Then add the frequencies, beginning with the bottom one, 
until you reach the class interval containing the number you 
observed in Step #1. In the example the median is in the 
class interval 80-84. 

Step #3—Determine how many cases you need of those in the interval 
containing the median to give you the number secured in 
Step #1, N/2 = 30. The number that you need to complete 
N/2 forms the numerator of a fraction, and the total number 
of frequencies in the interval containing the median forms the 
denominator of the fraction. In the example, there are 21 
cases (students) up to the 80-84 interval, therefore 9 more 
cases in the 80-84 interval would be needed to get 30. The frac- 
tion that would be formed is 9/19 since you need 9 of the 
19 cases in the 80-84 interval to arrive at the 30th score 
which would represent the half-way point in the distribution. 

Step #4—Multiply the fraction of Step #3 by the size of the class in- 
terval. In the example this would be 

$$ \left( \frac{9}{19} \right) 5 = 2.4 $$

Step #5—Add the result obtained in Step #4 to the lower limit of the 
interval containing the median. The result is the median. The 
lower limit of the class interval 80-84 is 79.5. To this would 
be added the number found in Step #4. 

Median = 79.5 + 2.4 = 81.9 
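
The five steps above might be sketched in Python as follows. The function name is invented for the illustration, and the table lists the lowest score of each interval with its frequency, beginning with the bottom interval as in Step #2.

```python
# (lowest score in interval, frequency), lowest interval first
table = [(55, 2), (60, 2), (65, 3), (70, 3), (75, 11),
         (80, 19), (85, 11), (90, 6), (95, 3)]


def grouped_median(table, size):
    """Median of grouped data, following Steps 1 through 5 of the work guide."""
    half = sum(f for _, f in table) / 2            # Step 1: N / 2
    below = 0                                      # cases counted so far
    for low, f in table:                           # Step 2: add from the bottom
        if below + f >= half:
            needed = half - below                  # Step 3: numerator of the fraction
            # Step 4 multiplies the fraction by the interval size;
            # Step 5 adds the result to the exact lower limit (low - 0.5).
            return (low - 0.5) + (needed / f) * size
        below += f


print(round(grouped_median(table, 5), 1))
```

The sketch reproduces the result of the work guide: 21 cases lie below 79.5, 9 of the 19 cases in the 80-84 interval are needed, and the median is about 81.9.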




VARIABILITY OF SCORES 


Up to this point, scores have been described in terms of range, 
rank, percentile rank, mean, median, and frequency distribution. All 
of these characteristics of a group of test scores help the teacher to 
interpret individual as well as group scores. In order to describe 
test scores fully, however, it is necessary to understand how dif- 
ferent sets of scores vary. 

In an earlier section of this chapter it was suggested that the limits 
of scores for a test could be found by identifying the highest and 
lowest scores. If all scores for all tests were distributed in an even 
or typical pattern, then knowing the extremes would help us to 
describe adequately the pattern of scores. Test data do not, how- 
ever, always distribute themselves in a typical pattern. The graphs 
shown in Figure 20 show three different tests with the same extreme 
scores but with very different distributions. 

In Classes A, B, and C the scores all range from 35 to 69. How- 
ever, the scores are distributed very differently in each case. In 
Class A the scores are distributed so that the greatest number of 
scores is centered in the 50-54 interval and there is a tapering off of 
the scores toward the extremes. Class B is different from A in that 
most scores are concentrated in the 50-54 interval but the grouping 
of the scores drops away toward the extremes very sharply. In Class 
C a very uneven distribution exists, although the range is still the 
same as in the other two classes. 

Because an analysis of test results must take into account the dif- 
ferences in the variability of test distributions, it is necessary to 
identify two other major methods for describing a set of scores. The 
standard deviation and the quartile deviation permit an analysis of 
the variability of test scores. The quartile deviation is related di- 
rectly to the concept of the median and percentile ranks, while the 
standard deviation is related to the concept of the mean and the 
normal distribution curve. 

To understand the concept of quartile deviation it is necessary to 
imagine that a set of scores is divided into equal parts. If the scores 
are divided into two equal parts, we know from the earlier discus- 
sion that the center point would be the median. This is true, since by 
definition the median is that point which divides a distribution into 
two equal parts. 

[Figure 20: histograms of test scores for Class A, Class B, and Class C; 
horizontal axis: Test Scores, in intervals from 35-39 through 65-69] 

Fig. 20. Three different distributions of similar test scores 


If the scores are divided into four equal parts, the lowest 25 per 
cent of the scores are classified as being below the first quartile, the 
next 50 per cent of the scores are in the interquartile range, and the 
top 25 per cent of the scores are above the third quartile. This can 
be illustrated as follows: 


                         Interquartile Range 
  25% of Scores   |<---- 50% of the Scores ---->|   25% of Scores 

Lowest            |              |              |            Highest 
Score             Q1             Q2             Q3           Score 
                               Median 
|<------- 50% of Scores -------->|<------- 50% of Scores ------->| 


The analysis of a set of scores by quartiles enables the teacher to 
see quickly the spread of scores for different classes. The computa- 
tion of Q1 and Q3 proceeds in the same way as for the median, as 
shown on page 324. To locate Q1 it is necessary to find the point that 
represents the 25th percentile and Q3 by locating the point corre- 
sponding to the 75th percentile. 


Problem: What are the median, Q1, and Q3 for the two distributions of 
test scores shown below? 


                  SCHOOL A          SCHOOL B 
Raw               Number of         Number of 
Score             Students          Students 

90-99                 1                 1 
80-89                 2                 2 
70-79                 5                 5 
60-69                20                 6 
50-59                16                 7 
40-49                 4                10 
30-39                 5                 4 
20-29                 2                 3 
10-19                 1                 2 
                 N = 56            N = 40 

School A: N/2 = 28, N/4 = 14 

$$ \text{Median} = 59.5 + \left( \frac{0}{20} \right) 10 = 59.50 $$
$$ Q_1 = 49.5 + \left( \frac{2}{16} \right) 10 = 50.75 $$
$$ Q_3 = 59.5 + \left( \frac{14}{20} \right) 10 = 66.50 $$

School B: N/2 = 20, N/4 = 10 

$$ \text{Median} = 49.5 + \left( \frac{1}{7} \right) 10 = 50.93 $$
$$ Q_1 = 39.5 + \left( \frac{1}{10} \right) 10 = 40.50 $$
$$ Q_3 = 59.5 + \left( \frac{4}{6} \right) 10 = 66.17 $$
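
The quartile computations for the two schools can be sketched with one small Python function that locates any percentile point in a frequency distribution. The function name is invented for this illustration. The final line of each report uses the usual definition of the quartile deviation as half the distance from Q1 to Q3, which is stated here as an assumption since the formula is not given in the problem.

```python
def percentile_point(table, fraction, size):
    """Point below which the given fraction of the cases fall (grouped data)."""
    target = sum(f for _, f in table) * fraction
    below = 0
    for low, f in table:                      # lowest interval first
        if below + f >= target:
            # exact lower limit plus the needed share of the interval
            return (low - 0.5) + ((target - below) / f) * size
        below += f


# (lowest score in interval, frequency), lowest interval first
school_a = [(10, 1), (20, 2), (30, 5), (40, 4), (50, 16),
            (60, 20), (70, 5), (80, 2), (90, 1)]
school_b = [(10, 2), (20, 3), (30, 4), (40, 10), (50, 7),
            (60, 6), (70, 5), (80, 2), (90, 1)]

for name, table in (("School A", school_a), ("School B", school_b)):
    q1 = percentile_point(table, 0.25, 10)
    q2 = percentile_point(table, 0.50, 10)
    q3 = percentile_point(table, 0.75, 10)
    quartile_deviation = (q3 - q1) / 2        # assumed definition, see lead-in
    print(name, round(q1, 2), round(q2, 2), round(q3, 2), round(quartile_deviation, 2))
```

Although the two schools have nearly the same third quartile, the interquartile range for School B is much wider, which shows the greater spread of its scores.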




The most common method used by statisticians and educational 
research workers to describe the spread (variability) of a set of 
scores is to compute the standard deviation, which is a measure of 
the spread of scores for a normal distribution. Standard deviation is 
a distance measured above and below the mean of a distribution. To 
understand the concept of standard deviation it is necessary to un- 
derstand what is meant by normal distribution. 

Normal distribution is a concept that can be explained by a mathe- 
matical consideration of the laws of probability or it can be ex- 
plained by studying the distribution of many different sets of data. 
(See Fig. 21). 

The curve illustrates how, for a large group of scores, there is a 
concentration around the average score. Now while the shape of this 
curve may be true for a large group, it is not necessarily true for 
small groups. In a large school system the reading achievement scores 
for all eighth-grade students might resemble the bell-shaped curve 
that is shown in Figure 21, but each eighth-grade classroom group 
might vary considerably from this pattern. 

The danger in interpreting scores for individual classroom use is 
that the teacher will assume that a small group of students fits the 
pattern of normal distribution which exists for large groups. A still 
greater problem exists when an effort is made to place labels of 
good, average, and poor on distributions because of the existence of 
a normal or near normal distribution. Normal distribution is not a 
substitute for standards or good teaching but a mathematical de- 
vice for studying a group of scores. 

While the existence of normal distributions can be illustrated by 
many different kinds of data, the actual construction of a normal 
distribution curve is a mathematical function. Mathematicians have 
established a theoretical normal distribution curve that can serve 
as the basis for understanding distributions that are closely related 
to the theoretical distribution. Teachers need to remember that theo- 
retical formulations, such as Einstein's Theory of Relativity, are 
instrumental in much of the practical research that has been and is 
taking place in atomic physics. 

The theoretical normal distribution curve is a bell-shaped curve. 
The line dividing the curve into two equal parts is the mean of the 
data represented by the curve. While the mathematics involved in 
the process are not simple, it would be possible to measure the area 
underneath the curve much as we might measure the area of a rec- 
tangle. 


[Figure 21, first panel: Normal Curve Fitted to Data of Baseball Throws 
for Distance by First-Year High School Girls; vertical axis: Number of 
Girls; horizontal axis: Distance in Feet] 

[Figure 21, second panel: Normal Curve Fitted to Batting Averages of 379 
Major and Minor League Baseball Players; vertical axis: Number of 
Players; horizontal axis: Batting Average] 

Fig. 21. Illustrations of fitted normal curves 

The percentage of the area under the curve at various distances 
from the mean has been computed, so that an individual knowing 
the standard deviation can compute the area included under the 
curve. 


Distance from Mean         Percentage of     Distance from Mean         Percentage of 
in Standard Deviations     Total Area        in Standard Deviations     Total Area 

0.1                           03.98          1.6                           44.52 
0.2                           07.93          1.7                           45.54 
0.3                           11.79          1.8                           46.41 
0.4                           15.54          1.9                           47.13 
0.5                           19.15          2.0                           47.72 
0.6                           22.57          2.1                           48.21 
0.7                           25.80          2.2                           48.61 
0.8                           28.81          2.3                           48.93 
0.9                           31.59          2.4                           49.18 
1.0                           34.13          2.5                           49.38 
1.1                           36.43          2.6                           49.53 
1.2                           38.49          2.7                           49.65 
1.3                           40.32          2.8                           49.74 
1.4                           41.92          2.9                           49.81 
1.5                           43.32          3.0                           49.87 

This information can be read as follows: At a distance of 1.0 standard 
deviations from the mean an area of approximately 34 per cent of the 
curve is covered, or 34 per cent of the cases are included in the area 
from the mean to +1.0 standard deviations. This information should be 
related to the illustration of the normal distribution curve, shown on 
page 338. 
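
The percentages in the table above can be reproduced from the theoretical normal curve. The sketch below uses the error function `math.erf` from Python's standard library; the function name `percent_between_mean_and` is invented for this illustration.

```python
from math import erf, sqrt


def percent_between_mean_and(z):
    """Per cent of the total area between the mean and z standard deviations."""
    return 100 * 0.5 * erf(z / sqrt(2))


for z in (0.5, 1.0, 1.9, 2.0, 3.0):
    print(z, round(percent_between_mean_and(z), 2))
```

At 1.0 standard deviations the function gives about 34.13 per cent, and at 2.0 standard deviations about 47.72 per cent, in agreement with the table.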


Once the standard deviation for a set of scores has been secured, 
it is possible to describe accurately the spread or variability of the 
scores. If we know that a set of scores has a mean of 50 and a stand- 
ard deviation of 10, we can say that approximately 68 per cent of the 
scores are between 40 and 60, that 16 per cent of the scores are above 
60, and that 16 per cent of the scores are below 40. It is also pos- 
sible by using the concepts of standard deviation and normal dis- 
tribution to compare different sets of scores that may have the same 
or different means and the same or different standard deviations. 
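
The percentages quoted above hold only to the extent that the scores follow a normal distribution. Under that assumption they can be checked with `statistics.NormalDist` (Python 3.8 and later), as in this brief sketch:

```python
from statistics import NormalDist

# A normal distribution with mean 50 and standard deviation 10 (assumed shape).
distribution = NormalDist(mu=50, sigma=10)

between_40_and_60 = distribution.cdf(60) - distribution.cdf(40)
above_60 = 1 - distribution.cdf(60)
below_40 = distribution.cdf(40)

print(round(100 * between_40_and_60), round(100 * above_60), round(100 * below_40))
```

The three figures come to approximately 68, 16, and 16 per cent.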




Problem: How do the scores of students on three different tests vary? 
Which group of students on each test has the lowest 16 per 
cent of the scores? 


              Mean      Standard       -1.0 Standard 
                        Deviation      Deviation 
Test #1        45           7              38 
Test #2        75          15              60 
Test #3        60          10              50 


Interpretation: The students on Test #2 showed the greatest spread 
of achievement while Test #1 revealed the least spread. Students with 
scores of below 38 on the first test, 60 on the second test, and 50 on the 
third test were in the lowest 16 per cent of their group on each respective 
test. 
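
The interpretation above rests on two simple computations, sketched here in Python. The dictionary layout and variable names are chosen only for illustration; the cutoff for the lowest 16 per cent assumes that each set of scores is approximately normal.

```python
# (mean, standard deviation) for each test in the problem
tests = {"Test #1": (45, 7), "Test #2": (75, 15), "Test #3": (60, 10)}

# One standard deviation below the mean marks off roughly the lowest 16 per cent.
cutoffs = {name: mean - sd for name, (mean, sd) in tests.items()}

# The standard deviation itself indicates which test shows the greatest spread.
greatest_spread = max(tests, key=lambda name: tests[name][1])
least_spread = min(tests, key=lambda name: tests[name][1])

print(cutoffs, greatest_spread, least_spread)
```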


STANDARD SCORES 


To facilitate the process of comparing scores from different tests, 
statisticians have devised the concept of the “standard score.” While 
there are several different types of standard scores that are used by 
test publishers to present data about test results, the two most com- 
mon ones are the z-score and the T-score. The z-score is secured by 
the following formula: 


$$ z\text{-score} = \frac{\text{raw score} - \text{mean}}{\text{standard deviation}} $$


The T-score formula is basically the same except that it sets the 
mean for a normal distribution curve equal to 50 and the standard 
deviation equal to 10. 


$$ T\text{-score} = 50 + \frac{10\,(\text{raw score} - \text{mean})}{\text{standard deviation}} $$


Problem: Find the standard scores (z-score) for the score of 45 when 
the mean for the set of data from which the scores were drawn 
is 50 and the standard deviation is 10. 


$$ z\text{-score} = \frac{\text{raw score} - \text{mean}}{\text{standard deviation}} = \frac{45 - 50}{10} = \frac{-5}{10} = -0.5 $$
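
The two standard scores can be sketched as small Python functions. The function names are invented for this illustration; the T-score follows the description given earlier, with the mean set equal to 50 and the standard deviation equal to 10.

```python
def z_score(raw, mean, sd):
    """Distance of a raw score from the mean, in standard deviation units."""
    return (raw - mean) / sd


def t_score(raw, mean, sd):
    """The same position expressed on a scale with mean 50 and standard deviation 10."""
    return 50 + 10 * z_score(raw, mean, sd)


print(z_score(45, 50, 10))
print(t_score(45, 50, 10))
```

For the problem above, a raw score of 45 gives a z-score of -0.5 and a T-score of 45.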




Problem: A teacher received the following report for five students after 
they had taken a standardized reading test: 


               T-scores 
John              45 
Richard           65 
Thomas            60 
Steve             40 
Allan             50 

The T-scores were computed for all of the students in the tenth grade 
of the local school system. 


Interpretation: Since a T-score is based upon a distribution where the 
mean is 50 and the standard deviation is 10, the scores indicate that 
Allan is at the average for the group, Richard and Thomas are in the 
upper 16 per cent of the group, and John and Steve are in the lowest 16 
per cent of the group. Only 2 per cent of the students are lower than 
Steve. 


The relationships between the normal distribution curve, standard 
deviation, percentiles, and typical standard scores are presented in 
Figure 22. It can be seen from Figure 22 that a student with a raw 
score that brings him to the 16th percentile of a group would have 
a z-score of -1.0 and a T-score of 40. All of these scores when 
related to the normal distribution curve indicate that the student is 
in the bottom 16 per cent of his group for that particular test. Figure 
22 is most helpful when it becomes necessary to analyze different 
types of reported score results. 
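
The correspondence among z-scores, T-scores, and percentiles can be sketched in Python for a few positions on the curve. The function name is invented for this illustration, and the percentile figures assume a normal distribution of scores.

```python
from statistics import NormalDist


def equivalent_scores(z):
    """Express one position on the normal curve as z-score, T-score, and percentile."""
    return {
        "z": z,
        "T": 50 + 10 * z,
        "percentile": round(100 * NormalDist().cdf(z)),
    }


for z in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(equivalent_scores(z))
```

A z-score of -1.0 corresponds to a T-score of 40 and to about the 16th percentile, as stated above; a z-score of -2.0 corresponds to a T-score of 30 and to about the 2nd percentile.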


NORMS 


The concepts of normal distribution, percentile rank, and standard 
score are useful in attempting to understand the development of 
norms for standardized tests. Norms have been developed by test 
makers to assist test users to understand more fully the relationship 
of one student’s score to the scores made by others. A norm, whether 
it be an age norm, grade norm, or percentile norm, is an average 
score for students of a specific grade or age. It is not a standard of 
performance but merely a measure of the expected achievement of 
a typical group. 

When a test maker decides to construct a test, a great deal of
research will enter into the various phases of test construction before
the publishers will print and distribute the test for general use. It
will be necessary to determine the types of items to use, the extent
to which the items actually measure what the test maker wants to
measure, and whether or not the test will actually discriminate
sufficiently so that the test users will be able to study the differences
between students. New forms of tests will be tried out with many
different student groups in many different schools and in many
different communities. After repeated tryouts, the test maker will
secure a pattern of scores such as that illustrated in Table 2.


[Figure 22: a normal distribution curve with aligned scales for standard
deviations, percentile equivalents, z-scores, T-scores, CEEB scores,
stanines (with the per cent of cases in each stanine), and subtest scores.]

Fig. 22. Relationship of various types of scores to the normal
distribution curve. (From The Test Service Bulletin, No. 48 (January, 1955).
Reproduced by permission of the Psychological Corporation.)

Age norms are developed by ascertaining the average score made
by students of a specific age group; grade norms are developed by
determining the average score made by students at specific grade
levels; and percentile norms are made by computing the percentile
ranks for students within an age or grade grouping.


Table 2
MAT DIFFERENTIAL INTELLIGENCE NORMS: GRADES 10, 11, & 12, MALES*

[Table body not recoverable from the source scan.]

* Intelligence test data were converted to sten scores.
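The way grade norms and percentile norms are derived can be illustrated with a small Python sketch. The tryout scores below are invented for the illustration, and the percentile rank uses one common convention (the per cent scoring below, plus half of those tied), which is an assumption and not necessarily the rule any particular publisher follows.

```python
from statistics import mean


def grade_norms(scores_by_grade):
    """Grade norm: the average score earned by the tryout students in each grade."""
    return {grade: mean(scores) for grade, scores in scores_by_grade.items()}


def percentile_rank(score, group_scores):
    """Per cent of the group scoring below `score`, counting half of any ties."""
    below = sum(s < score for s in group_scores)
    tied = sum(s == score for s in group_scores)
    return 100 * (below + 0.5 * tied) / len(group_scores)


# Hypothetical tryout data: raw scores keyed by grade.
tryout = {10: [38, 42, 45, 47, 53], 11: [44, 48, 50, 55, 58]}

print(grade_norms(tryout))               # average score for each grade
print(percentile_rank(45, tryout[10]))   # standing of a score of 45 among tenth graders
```

In practice the tryout groups number in the thousands, which is why the representativeness of the sample matters so much when a table of norms is applied to a local group.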

When using tables of norms, the teacher must be sure that the 
norms are based upon a representative sample of students, that the 
subparts as well as the total test are statistically reliable, and that 
the local student group tested is similar in all characteristics to the 
students with which the test norms were developed. 

Since there is usually a significantly high positive correlation be- 
tween one’s mental ability and his achievement, some test publishers 
attempt to take this into consideration by preparing norms for 
groups with different levels of mental ability. This is done to assist 
the test user in using the test results for diagnostic purposes. The 
MAT Differential Norms, Table 2, are an example of such norms. 


RELATIONSHIP OF SCORES 


It is frequently desirable in education to discover the extent of 
the relationship that exists between two or more sets of scores. 
Teachers want to know whether or not students who do well on a 
test of general ability should be expected to do well in algebra; the 
freshman counselor wants to know whether the scores on a general 
test of achievement are indicative of the success a student will have 
in high school; and the remedial reading teacher may want to know 
the extent of relationship between success in reading in an English 
class and reading in a science class. Answers to these questions can 
in part be obtained by the technique known as correlation analysis. 


To understand the process of correlation it is helpful to review 
some elementary principles of graphing. If a point is placed in the 
grid it can be located precisely by stating that it is at the intersection 
of two lines. In the grid above, the point is at (B-5), the intersection 
of the vertical alphabetical lines and the horizontal numerical lines. 
It is possible to plot points on the grid such as C-4, B-2, and E-1 by
reading along the horizontal and vertical lines to the points of
intersection.

This brief review of graph construction is helpful when trying to 
correlate two sets of scores on a scattergram. A scattergram is a 
diagrammatic representation of the concept of correlation, as shown 
in Figure 23. 


Fig. 23. Scattergram illustrations of the relationship between
two variables


While it is possible to illustrate diagrammatically the relationship 
between two sets of scores, the scattergram is not sufficiently ac- 
curate to allow for a careful analysis of the data. To overcome this 
difficulty, statisticians have devised the index known as the correlation
coefficient. Reliability and validity coefficients were introduced
to the reader in Chapter 5. They are both forms of the correlation
coefficient. 

The correlation coefficient may range along a scale from +1.0
(perfect positive relationship) through 0.0 (no relationship) to -1.0
(perfect negative relationship). Perfect relationships of +1.0 or
-1.0 are seldom found.


+0.8 to +1.0    very high correlation
+0.5 to +0.8    substantial correlation
+0.3 to +0.5    some correlation
+0.2 to +0.3    slight correlation
-0.2 to +0.2    practically no correlation
-0.3 to -0.2    slight correlation
-0.5 to -0.3    some correlation
-0.8 to -0.5    substantial correlation
-1.0 to -0.8    very high correlation
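The text does not give a computing formula for the coefficient. As a sketch only, the following Python function computes the product-moment (Pearson) coefficient, which is the form most commonly reported; the choice of that formula and the sample scores are assumptions made for illustration.

```python
import math


def pearson_r(xs, ys):
    """Product-moment correlation between two equally long lists of scores."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least two scores")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one set of scores does not vary")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical scores for five students on two tests.
ability = [1, 2, 3, 4, 5]
algebra = [2, 4, 6, 8, 10]
print(pearson_r(ability, algebra))        # scores rise together in a straight line
print(pearson_r(ability, algebra[::-1]))  # one set falls as the other rises
```

The first pair of lists gives a perfect positive relationship and the reversed pair a perfect negative one; real classroom data will fall somewhere between these extremes.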


The meanings of correlation coefficients of various sizes are illus- 
trated in Figure 23. Where sets of scores are highly related, the 
scattergram indicates that the scores are grouped so that they fit a 
straight line drawn diagonally from the point where the vertical 
and horizontal axes intersect. Where the relationship is not very high,
the scores are scattered very widely. It needs to be understood that
a high relationship between two quantities does not necessarily indi- 
cate cause and effect relationship between the variables. High grades 
in English may be secured by the student who receives high grades 
in Latin but the cause for the high grades may actually be high 
interest in the fields of language or high mental ability or some other 
common element affecting success in both fields. 

The correlation coefficient needs to be studied carefully for sig- 
nificance, for a specific value is influenced by the size of the sample 
from which it is secured and its predictive value has a noticeable 
margin of error. For example, if a correlation coefficient of +.35
between success on an arithmetic skills test and algebra is secured 
on a sample of thirty-five students, the results would not be nearly 
as significant as if one thousand students were studied. The size of 
the sample and the population are important factors to be con- 
sidered when attempting to interpret correlation coefficients. 
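The effect of sample size can be made concrete with a short Python sketch. It uses the Fisher z transformation to place an approximate 95 per cent confidence interval around an observed coefficient. This technique is not described in the text and is offered only as one standard way of showing how much less certain a coefficient from thirty-five students is than the same coefficient from one thousand.

```python
import math


def r_confidence_interval(r, n, z_crit=1.96):
    """Approximate confidence interval for a correlation, via the Fisher z transformation."""
    if n <= 3:
        raise ValueError("sample size must exceed 3")
    z = math.atanh(r)              # transform r to an approximately normal scale
    se = 1 / math.sqrt(n - 3)      # standard error on the transformed scale
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


for n in (35, 1000):
    low, high = r_confidence_interval(0.35, n)
    print(f"n = {n:4d}: r = +.35 could plausibly lie between {low:+.2f} and {high:+.2f}")
```

With thirty-five students the interval runs from barely above zero to about +.6, while with one thousand students it narrows to roughly +.29 to +.40.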


Table 3


Predictive efficiency of coefficients of correlation of
varying magnitude


Correlation    Percentage Increase in    Chances in 100 of Predicting above
Coefficient    Predictive Efficiency     or below Average in Future Behavior
    .00                 0.0                      50-50
    .10                 0.5                      50.25-49.75
    .20                 2.0                      51-49
    .30                 5.0                      52.5-47.5
    .40                 8.0                      54-46
    .50                13.0                      56.5-43.5
    .60                20.0                      60-40
    .70                29.0                      64.5-35.5
    .80                40.0                      70-30
    .90                56.0                      78-22
    .95                69.0                      84.5-15.5
    .98                80.0                      90-10
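The percentages in Table 3 agree with what is often called the index of forecasting efficiency, 100(1 - sqrt(1 - r^2)); for example, a coefficient of +.80 yields 40 per cent. The table does not state its formula, so the Python sketch below rests on that assumption, and on the further observation that each "chances in 100" entry equals 50 plus or minus half the efficiency figure.

```python
import math


def predictive_efficiency(r):
    """Per cent improvement over chance in prediction (index of forecasting efficiency)."""
    return 100 * (1 - math.sqrt(1 - r * r))


def chances_in_100(r):
    """Chances in 100 of a correct and an incorrect above/below-average prediction."""
    e = predictive_efficiency(r)
    return 50 + e / 2, 50 - e / 2


for r in (0.30, 0.60, 0.80, 0.90):
    right, wrong = chances_in_100(r)
    print(f"r = {r:.2f}: efficiency {predictive_efficiency(r):4.1f} per cent, "
          f"chances {right:.1f}-{wrong:.1f}")
```

The sketch makes plain how slowly predictive efficiency grows: a coefficient must reach about +.87 before prediction is even halfway between chance and certainty.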


Of great importance to teachers and counselors is the predictive 
value of the correlations that have been secured. For example, if 
the correlation coefficient between success on an English test and a
reading test is +.80, the predictive efficiency is 40 per cent (see
Table 3), or the chances are about three to one that the success of
students in one field can be predicted if their degree of success in the
other is known. It is possible to make predictions on the basis of
correlation data, but it must be re-emphasized that, until the
correlation coefficients are +.60 or better, the chances of making
accurate predictions are very slight. Even when the correlations are
up to +.90 it is not possible to say with certainty which one out of
three students will succeed in one area because he succeeded in an- 
other area. The teacher must use reported correlations with caution 
because they generally reflect group patterns rather than the be- 
havior patterns of individual students. 

Measurement is subject to error, and various techniques have been 
developed by statisticians to account for mathematical errors that 
result from sampling problems and the limitations of statistical
analysis. While a detailed analysis of the concept of standard error
is beyond the scope of this introductory account of test
interpretation, it is important for the teacher using test results to recognize
that a score received by a student on a test must be subjected to 
the scrutiny of logic as well as statistical analysis. It is hoped that 
this introductory material will encourage teachers to develop their 
statistical literacy so that they will be willing to approach the prob- 
lem of test analysis with more understanding. 


CHAPTER 16


Diagnosis from the Results of Measurement


TEST RESULTS, anecdotal record data, sociometric information, and
all other data that can be secured from the various techniques de- 
scribed earlier are relatively valueless unless an effort is made to 
use them and to discover the relationship which exists between them. 
Too often test results become isolated bits of information that are 
seldom used to improve instruction, modify the educational program, 
or guide students. One of the primary responsibilities of the teacher 
is to secure a balanced picture of the student so that proper em- 
phasis can be placed on the various aspects of pupil development. 

Studies of human behavior cannot be considered as relatively sim- 
ple procedures, for individual growth patterns do not fit into pre- 
viously prepared molds. The classroom teacher seeking to under- 
stand student behavior must recognize four forces that determine 
the actions of the individual, as shown in Figure 24. These forces 
are: (1) the individual himself, (2) his family, (3) his friends, and
(4) the broader environment in which the individual, his family, 
and his friends move. Learning and achievement are dependent upon 
the manner in which these forces operate. Thus, it is necessary that 
the teacher understand this process if successful evaluation is to take 
place. For purposes of description and analysis it is possible to seg- 
ment an individual, but it must be remembered that all of the ele- 
ments to be described below are closely interrelated. 

Each individual is a composite of abilities, interests, attitudes,
physical development, and emotions. The student who is doing poor
work in his English class (compared with other students) may be
doing as well as his mental ability will allow. The good student in
art may be doing well because of his natural aptitude and not
because of his effort. The average student may be doing exceptionally
well in French considering his lack of interest. The average student
in the American history classroom might be just average because it
is fashionable among his friends to be average. To understand why
the good student is good, the average student is average, and the
poor student is poor, the teacher should study his individual
characteristics: (1) mental ability, (2) academic ability and achievement,
(3) physical and mental health, (4) interests, (5) attitudes, (6)
social ability, and (7) aptitudes.


[Figure 24: INDIVIDUAL BEHAVIOR at the center, acted upon by four forces:
FRIENDS (attitudes, interests, activities, abilities); FAMILY (attitudes,
interests, abilities); SELF (mental ability, academic ability, physical and
mental health, attitudes, interests, aptitudes, social ability); and
ENVIRONMENT (home, community).]

Fig. 24. Forces operating as determinants of student behavior

In Chapter 14, it was pointed out that much confusion over the use 
of mental ability tests still exists because of the persistent belief 
that a mental ability test score is an absolute score. While it was 
clearly shown that this is not the case, it still remains important to 
recognize that such a score may be an index of student capacity. 
The student with low-average mental ability who is working up to 
his capacity should not be expected to achieve the success of the 
student with high mental ability. Students, however, who do have 
the mental ability but who are not working effectively need to be 
studied to determine why they are operating below capacity. 

Since much of what is done in the schools is dependent upon the 
academic skills of reading, writing, and speaking, it is important for 
the teacher to know rather precisely how the student rates in each of 
these skill areas. The failure in science may be a reading problem 
and not a problem for the science teacher. Lack of ability in written 
communication may be the cause of student problems in the business 
management class as well as in the English class. Students who lack 
the basic skills needed for success in the classroom might readily 
become the behavior problems, the disinterested, and the early drop- 
outs. Knowing the existence of deficiencies, the teacher of algebra, 
history, French, home economics, or salesmanship is in a better posi- 
tion to help the student than is the teacher who isn’t aware of such 
problems. 

The physical and mental health of the individual is now recognized 
as a determinant of student behavior. Poor diet, eye and ear dis- 
abilities, and obesity are among the many causes of student prob- 
lems. Thus, the classroom teacher is faced with the necessity of recog- 
nizing the impact of health factors upon the well-being of the stu- 
dent. In recent years psychologists have helped teachers become 
aware of the importance of mental health in the classroom, as they 
have carefully documented the concept that the emotionally dis- 
turbed individual tends to be in poor condition to respond to the 
average classroom routine. Problems of security, recognition, and 
affection are no longer outside the province of the classroom, for
emotional instability may create as many, if not more, problems for
the teacher as will poor eyesight. 

The illustrations with reference to mental ability, academic abil- 
ity, and physical and mental health cited above have been designed 
to illustrate the influence that many factors have upon individual 
behavior. These examples, however, do not illustrate the role which 
is played by the external forces of family, friends, and environment. 
Most teachers readily accept, not necessarily with approval, the fact 
that teenagers are subject to fads. This semester all the girls wear 
white ankle socks, next semester all the colors of the rainbow are 
visible. Sloppy attire runs rampant and then it is replaced by the
party dress vogue. Such is the course of existence of the secondary 
school student. The influence of the fad also extends to other areas. 
In some circles it isn’t fashionable to be a “grind” and the average 
grade is considered acceptable. Students decide on the basis of in- 
dividual teachers how much production is necessary to “get by” just 
as factory workers establish their work load quotas. Ineptness on the 
part of students needs to be considered a group problem as well as 
an individual problem. 

It should be readily apparent that the teacher attempting to study 
student behavior must be aware of the multiplicity of factors that 
are in operation in the ordinary classroom. Student failure may be 
the result of factors operating within the student, factors created by 
the home situation, or factors within the classroom. To diagnose 
student behavior requires the same skill as that of the physician as 
he seeks to determine the causes of aches and pains. Just as the 
physician must make use of many diagnostic techniques, so too must 
the teacher use a variety of techniques if appropriate treatment is 
to be prescribed. 


USING TEST BATTERIES FOR DIAGNOSTIC PURPOSES 


All of the methods, techniques, and devices that have been de- 
scribed in the preceding chapters are useful in securing data that can 
be used to diagnose individual and group strengths and weaknesses. 
Standardized test batteries, however, represent one constellation of 
techniques that requires additional consideration. This is so because, 
with all of their limitations, they still are the best available means
of securing information about large groups of students. It is also
necessary to pay special attention to these batteries because teachers
are called upon to use and interpret the data of the standardized 
testing programs found in most of the schools. 

Cook has analyzed the function of diagnostic tests and has pointed 
out some of their weaknesses as well as their strengths. 


The more general achievement test batteries which yield scores in 
vocabulary, reading comprehension, arithmetic reasoning, arithmetic 
computation, etc., have limited value in the planning of instruction for 
specific pupils. The major functions of such comprehensive batteries may 
be summarized briefly as follows: 


1. To direct curriculum emphasis by:
a. Focusing attention on as many of the important ultimate objec- 
tives of education as possible. 
b. Clarifying of educational objectives to teachers and pupils. 
c. Determining elements of strength and weaknesses in the in- 
structional program of the school. 
d. Discovering inadequacies in curriculum content and organiza- 
tion. 
2. To provide for educational guidance of pupils by: 
a. Providing a basis for predicting individual pupil achievement in 
each learning area. 
b. Serving as a basis for the preliminary grouping of pupils in each 
learning area. 
c. Discovering special aptitudes and disabilities. 
d. Determining the difficulty of material a pupil can read with 
profit. 
e. Determining the level of problem-solving ability in various 
areas. 
3. To stimulate the learning activities of pupils by: 
a. Enabling pupils to think of their achievements in objective 
terms. 
b. Giving pupils satisfaction for the progress they make, rather 
than for the relative level of achievement they make. 
c. Enabling pupils to compete with their past performance record. 
d. Measuring achievement objectively in terms of accepted edu- 
cational standards, rather than by the subjective appraisal of 
teachers. 
4. To direct and motivate administrative and supervisory efforts by: 
a. Enabling teachers to discover the areas in which they need 
supervisory aid.
b. Affording the administrative and supervisory staff an over-all
measure of the effectiveness of the school organization and of 
the prevailing administration and supervisory policies. 


However, such achievement test batteries are too general to be used as 
a basis for instruction even when detailed analysis of items is made; al- 
though such an analysis has value and is certainly not to be discouraged. 
The sampling of items is too limited and the organization too gross for 
such tests to be considered as adequate guides in the planning and 
directing of educational experiences for individual pupils.¹


In considering the use of a standardized test battery the teacher 
should be in a position to analyze with considerable care the instru- 
ment that is to be used. Until a test or test battery is critically 
analyzed by those who are going to use it, the test should not be 
used in a school. Frequently, promotional plans of test publishers are 
influential in encouraging school systems to install a testing program 
before the teachers, counselors, and others entrusted with the re- 
sponsibility for administering the program are given the opportunity 
to screen the tests. In many school systems teachers are expected 
to use the results of a testing program without a full awareness of 
the significance of what they are expected to do. 

To study a test or test battery critically it is necessary to ascertain 
the following information: 


I. General facts
    Title, author, publisher, designated function
II. Significance
    Does it measure a trait or ability which is significant for
    education?
III. Validity
    Does it measure what it purports to measure?
IV. Reliability
    How accurately and consistently does the test measure?
V. Diagnostic value
    Does the test give information as to why children fail, what their
    special difficulties are, etc.?
VI. Norms
    What types are available? Are they representative? What is
    their range?
VII. Administration
    Is the test easy to give and score? How much time is required?
VIII. Pupil performance
    What does the pupil do? What dimensions (speed, quality,
    difficulty) are measured? How are the scores expressed?
IX. Construction of the test
    How were the exercises selected? How was the placing of the
    exercises carried out?
X. Manual
    Does it contain complete and easily intelligible directions? Does
    it give norms? Does it suggest uses to which the test results might
    be put? Does it give data about the test?
XI. Costs
    The cost should be considered in relation to the dependability
    and usefulness of the results.
XII. Mechanical considerations
    What are the good and bad points about the typography, make-up,
    packing, size, and complexity of the test, etc.?


1 Walter W. Cook, “The Functions of Measurement in the Facilitation of
Learning,” Educational Measurement (Washington, D.C.: American Council on
Education, 1951), pp. 36-37.


Much of the information that is needed to make an effective and 
efficient appraisal of tests or test batteries can be found by a thor- 
ough study of the test and the test manual. In recent years, en- 
couraged by the standards proposed by the American Psychological 
Association, the American Educational Research Association, the 
National Conference on Measurements Used in Education, and their 
own professional staffs, test publishers have extended the coverage 
of their manuals to include detailed analysis of the process of secur- 
ing validity, reliability, and norms. The manuals of the more re- 
cently published tests also provide detailed suggestions on how the 
tests might be used. 

Once a decision has been reached by a school system to use a
particular test or test battery, the major problem for the teacher then
becomes one of interpretation. Here the teacher faces two problems:
how to interpret the test results for individuals, and how to interpret
the results for a total classroom group. While the problems are
interrelated, they have unique characteristics. For example, it may not
be a serious classroom problem if one individual in an English class
is weak in language usage but for the individual with college-oriented
goals it may be a very serious problem. On the other hand, if all of
the students are weak in dictionary skills, the classroom
consequences may be much greater than the individual consequences.


Cooperative School and College Ability Tests

[Profile chart: percentile bands for the verbal, quantitative, and total
scores, with space for other tests administered.]

Interpretations: Scores profiled here are bands rather than points. The
midpoint of each band shows approximately what percentage of students in
the norm group earned scores lower than the one profiled. Each band covers
two standard errors of measurement, one above and one below the percentile
rank for the score earned. This means that the chances are 2-to-1 that the
student's "true" score lies within the range of the band. If the bands of
the student's verbal and quantitative scores overlap, there is probably no
important difference between the scores. If the two bands do not overlap,
the chances are about 5-to-1 that there is a real difference in measured
ability present. (See Manual for additional information on interpretation.)

Fig. 25. Sample profile, Cooperative School and College Ability Test

In using test batteries the teacher or counselor is seeking infor- 
mation that may be of some help in guiding students, measuring 
progress, or improving the teaching-learning situation. The test 
battery, depending upon the instruments that are used, may be able 
to offer gross area indications of individual or group weaknesses or 
strengths, or it may be used to secure information about specific 
weaknesses or strengths. The examples of profiles that are presented 
in Figures 25, 26, and 27 illustrate some of the numerous methods 
used by test publishers to assist teachers to diagnose student ability 
and progress. 

The Cooperative School and College Ability Test score profile, 
Figure 25, illustrates a number of good principles of score presenta- 
tion. From the sample profile it can be seen that the student, Richard 
Roe, had a percentile rank of 37 on the verbal portion of the test 
(sentence completion tasks and vocabulary tasks) and a percentile 
rank of 62 on the quantitative portion of the test (numerical compu- 
tation tasks and numerical problem-solving tasks). The publishers 
of the Cooperative Test make use of the concept of “a band of pos- 
sible test scores” to illustrate the fact that the chances are two to 
one that Richard's true score is within a band whose limits are set at 
“two standard errors of measurement.” For example, Richard’s true 
score on the quantitative portion of the test is somewhere on a band
between a percentile rank of 50 and a percentile rank of 75. 

The shaded portion of each segment of the profile represents the 
middle portion of the standardized norm group. According to the 
test publishers, if the bands for each test do not overlap, the chances 
are five to one that there is a real difference in the measured 
abilities. 
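The overlap rule lends itself to a simple check, sketched here in Python. Richard's quantitative band (percentile ranks 50 to 75) is taken from the discussion above; the verbal band shown is hypothetical, since only its midpoint of 37 is reported, and the function name is illustrative.

```python
def bands_overlap(band_a, band_b):
    """True when two percentile bands, given as (low, high) pairs, share any ground."""
    low_a, high_a = band_a
    low_b, high_b = band_b
    return low_a <= high_b and low_b <= high_a


quantitative = (50, 75)   # band reported for Richard's quantitative score
verbal = (26, 49)         # hypothetical band around a verbal percentile rank of 37

if bands_overlap(verbal, quantitative):
    print("Bands overlap: probably no important difference between the scores.")
else:
    print("Bands do not overlap: a real difference in measured ability is likely.")
```

The check is deliberately conservative: two scores are treated as different only when the whole of one band lies clear of the other.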

The analysis made by the test publishers of this sample profile 
illustrates the way teachers can make their own analyses of profiles: 


1. All three of his scores on the SCAT battery and both of his scores
on the other two tests fall in the middle half of the percentile rank scales
(the lightly shaded sections between the 25th and 75th percentiles),
indicating that Richard probably is neither very gifted nor really retarded
in these measured abilities when he is compared with other 11th-graders
generally.

2. His quantitative score is considerably higher than his verbal score— 
sufficiently higher so that the confidence intervals of the two scores do 
not overlap. This means that the chances are about 5-to-1 that a true 
difference exists and that Richard really is better in the quantitative 
skills measured by this test than he is in the measured verbal skills. 
(Note that in this case the difference in the percentile ranks of the two 
scores is 25 points, yet the confidence intervals just barely fail to over- 
lap.) 

3. Although Richard’s quantitative score is significantly higher than 
his verbal score, it still falls in the range of the middle half of the stu- 
dents in his grade and does not indicate any notable superiority in the 
quantitative skills measured. 

4. Richard's total score places him very near the middle of the range 
for 11th graders. If his verbal skills of the kind measured by this test 
could be improved to the level of his quantitative ability, his total score 
probably would move into the “upper half” of the grade group. 

5. Richard took the Cooperative Reading Comprehension Test, Higher 
Level, Form Y, less than a week after he took the ability tests. The pro- 
file shows that his teacher or counselor marked in the percentile rank 
of his score according to the Grade 11 norms and also the confidence in- 
terval for the score, which he obtained in this way: 


a. Finding the standard error of measurement for the test [it’s given
somewhere in the test materials by most publishers], he located the
scores which were one standard error above and one standard error
below the score Richard earned.

b. Looking at the norms table for the test, he found the percentile
rank values of these two scores and marked them on the profile.

c. When he shaded the area between his marks above and below the
score in the profile column, he defined a confidence interval for
that score which closely resembled the confidence intervals for the
ability test scores and could be compared with them.


Although Richard's score on the Cooperative Reading Test has a slightly 
higher percentile rank value than does his verbal score on the SCAT, 
the confidence intervals of the two scores overlap on the profile. These 
two scores, then, can be interpreted as if they were the same and one 
can say that Richard's reading comprehension test results agree with his 
verbal ability test results. 


6. Richard also took the Cooperative Trigonometry Test on the same 
day as he took the reading test and his teacher or counselor has shaded 
the confidence interval for his score (using the method described for the 
reading test score). Although the percentile rank of his trigonometry test 
score is within the middle or “average” half of the distribution, it is
substantially lower than his quantitative score on the SCAT. The confidence
intervals of those two scores do not overlap, indicating a likelihood that 
a real difference exists between them. One cannot automatically assume 
an educationally important difference, however, because the scores were 
earned on tests that were standardized on different student populations. 
The SCAT measures, being general tests of academic ability, quite natu- 
rally would be normed on a group of students who came close to being 
representative of all the students in the 11th grade of a number of high 
schools. The trigonometry tests, on the other hand, would as naturally 
be normed on a student group representative of students who take trigo- 
nometry. Since high school students who take trigonometry usually are 
a rather selected group, it is quite likely that the norms for the trigo- 
nometry test are “harder” (more difficult to earn an average score) than 
the norms for the quantitative tests in the SCAT series. This would lead 
one to think that perhaps Richard’s quantitative and trigonometry scores 
are not really as different as they seem to be. Nevertheless, a significant 
difference shows on the profile and Richard’s mathematics teacher could 
do no harm by trying to find out if some shortcoming in preparation,
instruction, or motivation is keeping Richard from doing as well in
trigonometry as his quantitative ability score seems to indicate he can do.²


The California Achievement Tests are designed to measure status 
of achievement and also to provide a basis for planning remedial 
instruction in areas where individual students may be deficient. The 
data shown on the profile are taken from the test manual. This profile 
makes use of the grade placement concept and also uses percentile 
ranks. Grade placements were established on the basis of the student 
population used to secure the norms for this particular test. There 
are six main divisions to the test and a number of subdivisions within 
several portions of the test. The limited number of items within 


2 Examiner's manual, Cooperative School and College Ability Tests (Princeton, 
New Jersey: Educational Testing Service, 1955), pp. 33-34. In 1957 a series of 
SCAT publications will be issued to replace the 1955 Examiner’s Manual. A revised 
profile with appropriate suggestions for interpretation will be contained in SCAT— 
Manual for Interpreting Scores. 




[Figure 26: sample diagnostic profile, complete battery, for a test given in January to a 10th grade student (age, 183 months; mental age, 192 months). The chart plots the student's scores in Reading Vocabulary, Reading Comprehension, Mathematics Reasoning, Mathematics Fundamentals, Mechanics of English and Grammar, and Spelling, with totals, against grade placement and percentile scales. The plotted values are not legible in this copy.]

Fig. 26. Sample profile, California Achievement Test




some of the subsections of the test make it mandatory that the 
teacher using this test be cautious in interpreting student weak- 
nesses. The publishers of this test provide a checklist to be used 
along with the test to facilitate the diagnostic analysis of learning 
difficulties. The checklist is valuable in helping teachers identify the 
types of difficulties students may be having. 

The following analysis of the profile appearing in Figure 26 was 
made by the test publishers: 


Examination of the sample profile shows that the student has a chrono- 
logical grade placement of 9.9, an actual grade placement of 10.4, and 
an intelligence grade placement of 10.3. His Total Reading score gives
him a reading grade placement of 10.2, which is satisfactory in view of 
his mental age; but it places him above his chronological grade place- 
ment. His Total Reading score also places him at the 50th percentile in 
comparison with the standardization population. 

Examination of the profile shows that the student is up to expectancy 
in Reading Vocabulary, and that his grade placement 9.8 in Reading 
Comprehension would have been higher except for his score in reference 
skills. These results indicate that the teacher should examine his specific
responses to items in this sub-test and do remedial work where needed. 

His Total Mathematics score gives him a mathematics grade placement
of 10.4, which matches his expectancy in view of his mental age,
but it places him somewhat above his chronological grade placement. 
This Total Mathematics score also places him at the 50th percentile in 
comparison with the standardization population. 

Examination of the profile also shows that the student is slightly 
below expectancy in Mathematics Fundamentals, and that his grade 
placement of 10.8 in Mathematics Reasoning would have been even 
higher except for his score in numbers and equations. These results indi- 
cate that the teacher should examine his specific responses to items in 
numbers and equations and multiplication and do remedial work where 
needed. 

His Total Language score gives him a language grade placement of 
11.1, which is considerably above his expectancy in view of his mental
age, and it also places him considerably above both his actual and chron- 
ological grade placements. His Total Language score also places him at 
the 60th percentile in comparison with the standardization population. 

Examination of the profile shows, however, that whereas the student 
has a 12.8 grade placement in Mechanics of English and Grammar, his 




Total Language grade placement would have been even higher except 
for his score in Spelling. This student needs remedial work in spelling.3


The Science Research Associates, publishers of the Iowa Tests of 
Educational Development, use still other means for developing profiles.
The profile for this test, Figure 27, makes use of standard scores
and percentile ranks. According to the test publishers, “the standard 
score scales are so fitted to the raw score scales that a representative 
sample of pupils from grades 10 and 11 inclusive show the same 
normal distribution of standard scores on each of the tests (and on 
the composite). For such a sample, the median standard score is ap- 


[Figure 27: profile of standard scores, The Iowa Tests of Educational Development, first semester. Instructions printed on the chart: To plot the profile, mark the position of each score on the standard scale and join the marks with straight lines. Each dotted line joins the scores corresponding to the percentile rank indicated in the circle. For example, the 75th percentile for ninth grade students is approximately 15 for Test 1, and for Test 2 is 13. Credited to the State University of Iowa and Science Research Associates. The plotted profile is not legible in this copy.]

Fig. 27. Sample profile, Iowa Tests of Educational Development


proximately 15 for each test, while the range (the difference be- 
tween the highest and the lowest score) is approximately 30 for 
each test. The use of standard scores thus makes the test results 
highly comparable from test to test. It makes possible the drawing 
of a meaningful profile of test performance for the Iowa tests." 

The interpretation of the profile of John Jones’ achievements on 
the Iowa Tests of Educational Development as made by the test
publishers follows in part: 


The profile indicates that John’s general level of educational develop- 
ment is low. All but one of the scores falls below the 50th percentile and 
his composite score lies at the 25th percentile. 


3 Manual, California Achievement Tests (Los Angeles: California Test Bureau,
1951). 




This conclusion should, of course, be checked against whatever other 
information is available about John. Possibly his previous educational 
training has been of poor quality. He may have lacked interest in school 
work over a long period, or he may be below average in ability to learn. 
His low vocabulary and composite scores are some evidence of this, but 
it is also possible that he was not working to his full capacity when he 
took the tests. One would have to know much more about John to be 
sure of the cause of his low performance. 

Since John’s scores on Tests 2 and 6 were both low, it seems fair to 
conclude that his limited informational background in natural science 
probably contributed to his low score on the reading test in that area. 
If all of his reading test scores had been consistently lower than his 
other scores, one might think that his general reading skills were inade- 
quate. However, that is not the case. His performance on Test 5 was 
above the median for his grade. In fact it was better than the average 
score of his own class. Since the class average was a little higher on 
Test 5 than on Test 6, it is possible that the entire class has had more 
experience in the reading of articles on social studies topics than on 
scientific subjects. And John may be particularly well equipped in the 
specialized reading skills required by articles on social themes. Perhaps 
if his informational background in the field were improved, he would 
have scored even higher on Test 5 than he did. On the other hand, both 
improvement in background and extensive reading practice in the natural 
sciences may be necessary to improve his score on Test 6. Because of the 
marked peak in John’s profile, the teacher should also explore the possi- 
bility that he may have been “keyed up” or better motivated during one 
testing period than during others.4


The profiles that have been presented are typical of the many 
different types of profiles that are currently being used to assist 
teachers to interpret test scores and to diagnose student abilities. 
While the profiles are helpful, it must be remembered that the 
causes of student behavior are varied and complex. 

The need for a rigorous interpretation of standardized test results 
becomes readily apparent when studies are made to determine the 
relationship which exists between such factors as test scores and 
school grades or achievement test scores and mental ability test 
scores. The data in Figure 28, taken from studies completed by the


4 How to Use Test Results: A Manual for Teachers and Counselors for the Iowa 
Tests of Educational Development (Chicago: Science Research Associates, 1953), 
pp. 37-38. 




California Test Bureau for use in validating a multiple aptitude test, 
illustrate the care which must be taken when using standardized 
test results for predictive purposes. 

The marks that students had received in English were correlated 
with the scores these same students received on the Language Usage 
Test of the Multiple Aptitude Test. It might reasonably be expected 
that students doing well on the test would do well in an English 


[Figure 28: table and bar chart titled "English Marks vs. Multiple Aptitude Test 3, Language Usage," showing for each band of test scores the number and the percentage of students receiving each school mark. The cell values are not legible in this copy.]


Fig. 28. Relationship of English marks and scores received on the 
Multiple Aptitude Test 3, Language Usage 


class. This expectancy is supported because a Pearson product- 
moment correlation of +.41 was secured. The extent to which test in- 
formation from the Language Usage Test of the Multiple Aptitude 
Test can be used to predict student grades in English is carefully 
analyzed in Figure 28. Of the group of students scoring the highest 
on the test, 56 per cent received grades of B or better, while 44 
per cent received grades of C or lower. Of the group of students 




scoring the lowest on the test, only 5 per cent received grades of B 
or better and 95 per cent received grades of C or lower. While the 
test does offer a good gross prediction of student success in this 
specific area, it does not permit the teacher or counselor to make 
positive judgments without recourse to other information about the 
individual student. This point is also made by the test publishers: 
“It is also important that other types of data be used and that test 
results be interpreted in terms of all the information that can be 
gathered.” 5 
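The two computations discussed here, a product-moment correlation and a table of marks by score band, can be sketched as follows. The scores and marks are hypothetical, invented only to show the arithmetic; they are not the California Test Bureau data, and the function names and the ten-point band width are likewise assumptions of this sketch.

```python
from collections import defaultdict
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def expectancy_table(scores, marks, band_width=10):
    """For each score band, the percentage of students earning each mark."""
    counts = defaultdict(lambda: defaultdict(int))
    for score, mark in zip(scores, marks):
        band = (score // band_width) * band_width  # e.g. 92 falls in band 90
        counts[band][mark] += 1
    table = {}
    for band, by_mark in counts.items():
        total = sum(by_mark.values())
        table[band] = {m: round(100 * c / total) for m, c in by_mark.items()}
    return table


# Hypothetical test scores and English marks for eight students.
scores = [92, 95, 91, 97, 34, 31, 38, 36]
marks = ["A", "B", "C", "B", "D", "C", "F", "B"]
points = {"A": 4, "B": 3, "C": 2, "D": 1, "F": 0}

r = pearson_r(scores, [points[m] for m in marks])
table = expectancy_table(scores, marks)
```

A table of this kind makes the limits of prediction visible in a way the single coefficient does not: even in the highest band some students earn low marks, and in the lowest band some earn high ones.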

Studies of the relationship between the factor identified as general 
mental ability and achievement also reveal data that can be helpful 
in diagnosis but also illustrate the difficulties faced in the interpretation
of test results. Figure 29 is an expectancy chart that relates
scores received on the Nelson Biology Test published by the World 
Book Company to scores received on either the Terman-McNemar, 
Pintner, or Otis mental ability tests. As can be seen from the chart, 
the higher the student’s intelligence quotient on any of the three
tests, the higher the score he would be expected to receive on the 
achievement test. This wouldn’t be surprising, since the factors 
measured by many “intelligence” tests appear to be the same factors 
needed for success in academic enterprises. 

What is interesting, however, is the wide range of expected achievement
scores for persons with similar intelligence quotients. For example,
students in the Terman-McNemar I.Q. range of 105-109
could be expected to range from an achievement standard score of 
80 to 140. Thirty per cent of the students could be expected to have 
standard scores of less than 105 and 30 per cent could be expected 
to have standard scores of above 113. While the intelligence quotient 
may be useful in providing a gross measure of student ability, it is 
readily apparent from the expectancy table that many other factors 
must enter into a determination of the achievement ability of stu- 
dents. 

The test publishers recommend that student scores be plotted 
directly on the chart and that the position above or below the heavy 
black line be used as an indicator of learning effectiveness. The 
“band” concept used in describing scores on the Cooperative Test 

5 Manual, Multiple Aptitude Tests (Los Angeles: California Test Bureau, 1955), 
p. 43. 




[Figure 29: Nelson Biology Test expectancy chart. Achievement standard scores are plotted against deviation I.Q. ranges on the Terman-McNemar, Pintner, and Otis tests, with a heavy black line for expected achievement, broken lines on either side of it, and a box of basic data. Published by World Book Company, Yonkers-on-Hudson, New York, and Chicago, Illinois. The plotted values are not legible in this copy.]


Fig. 29. Sample expectancy chart 


described earlier would probably be useful here. It can be expected,
however, that the greater the departure from the heavy black line,
the more dependable is the judgment that the student's achievement
is exceptionally good or poor. Students within the range of the first
broken line on either side of the solid black line are probably doing
as well as might be expected on the basis of available evidence.




As has been repeatedly explained, diagnosis of the strengths and 
weaknesses of students requires a comprehensive analysis of many 
factors. It is not an impossible job but it is a responsibility that de- 
mands time and attention. It cannot be done if reliance is placed 
upon one test battery or even upon a group of tests from a single test 
publishing source. Even test publishers concur in the conclusion that 
the standardized test battery is a useful tool in securing needed in- 
formation for diagnostic purposes, but it is only useful to the extent 
that it supplements a comprehensive program of measurement and 
evaluation. 


A COMPREHENSIVE DIAGNOSTIC PROGRAM 


One of the major problems that confronts teachers in evaluating 
student progress concerns the manner in which classroom materials 
are used in the program of evaluation. A proposal for solving this 
problem, a plan for getting evidence of pupils’ all-around growth 
and development, has been advanced by Paul Diederich, of the Edu- 
cational Testing Service. While the plan is designed for use by the 
staff of a school, with slight modifications, it can be used by the 
teacher for an individual classroom. 


THE PROFILE INDEX 


The “profile index” is designed to keep track of data on pupil de- 
velopment and to give at a glance some idea of what these data 
mean. It is not a rating scale. At no time does any teacher or 
counselor record on the profile index his subjective opinion of any 
aspect of a pupil’s growth. It is assumed that teachers in all fields 
will be collecting evidence at various times throughout the year on 
those aspects of pupil development which the school regards as 
important. As soon as these teachers are through with the evidence 
for their own purposes, instead of dumping it in a wastebasket, they 
deposit it in a box in the central office. A clerk will then sort it into 
the mailboxes of the counselors of the pupils concerned. The counsel- 
ors are responsible for filing this material in the pupils’ folders and 


6 Paul Diederich, “Steps in the Development of a Total Program of Student
Appraisal,” Pupil Appraisal Practices in Secondary Schools, Circular No. 363, Office of
Education (Washington, D.C.: Government Printing Office, 1952), pp. 42-54. 




for recording it at the same time on the profile index. They do so as 
follows:

Let us assume that Counselor X finds two pieces of evidence in 
his mailbox. The first is a form labeled "Record of Incomplete and
Unsatisfactory Work." It indicates that Mary Smith failed to complete
an important assignment in English on the due date. The excuse
she wrote on the form at the time was unconvincing; she suggested
a later date for completing the assignment but missed it by three
days; and the teacher's comment on the affair was most unfavorable.
If the school is systematic in its collection of data, this evidence
will probably come in labeled "B12-W," indicating that it refers to
the objective numbered B12 in the profile index (“gets things done 
on time"), and that it shows a weakness in this aspect of develop- 
ment. After perusing this bit of evidence, the counselor opens Mary 
Smith's folder and finds that there are already twenty-six pieces of 
assorted evidence in it; that is, the piece on top is numbered 26. 
The new evidence, then, becomes number 27. The counselor writes 
this number on the top of the new evidence and places it in the 
folder on top of number 26. It is always to remain in this position 
in the folder. The counselor then writes the number 27 on Mary's 
profile index (which is always uppermost in her folder) opposite 
objective B12 and in the column labeled Weak. 

The second piece of evidence is a biology test taken by John Jones. 
The teacher's notation indicates that it refers to objective F1, knowl- 
edge of the natural sciences, and shows strength in this area; also 
to objective F13, interpreting data, in which the student placed in 
the middle half of the group tested—a status which the profile index 
calls "average." Since there are already thirty-four pieces of evidence 
in Jones' folder, the new one becomes number 35. The counselor 
writes this number on the top of the test and places it in the folder 
on top of number 34. He then writes the number 35 in two places on 
John's profile index: opposite objective F1 in the column labeled 
Strong, and opposite objective F13 in the column labeled Average. 

As the profile index fills up with numbers, it will show at a glance 
the aspects of development on which the school has collected data 
for any given pupil and the degree of attainment of each objective 
indicated by the evidence. If the counselor is worried by the weak- 
ness revealed in an area, such as "exercises self-control,” he can 




instantly locate the evidence bearing on this objective, for the num- 
bers written after it will tell him the serial order of this evidence 
in the folder. If the counselor does not have time to write down 
one number for each piece of evidence that he gets, it is plain that 
no record at all will be kept, for it is impossible to write less than 
one number. The evidence will simply be dumped into folders with- 
out passing through the mind of anyone who knows the pupil. As the 
folders fill up with unassorted evidence, the task of reading it and 
making any sense at all of it will become impossible. Then the 
folders may as well be thrown away. 
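The filing routine just described amounts to a small record-keeping scheme, and it can be sketched as a data structure. This is an illustration under stated assumptions rather than part of Diederich's plan: the class name PupilFolder and the methods file and locate are hypothetical, and the placeholder strings stand in for the earlier pieces of evidence already in each folder.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class PupilFolder:
    """One pupil's folder: numbered evidence plus the profile index on top."""
    pupil: str
    evidence: list = field(default_factory=list)
    # index[objective][column] holds the serial numbers of the evidence
    index: dict = field(
        default_factory=lambda: defaultdict(lambda: defaultdict(list))
    )

    def file(self, item, ratings):
        """File one piece of evidence and record its number on the index.

        ratings maps objective codes to a column, e.g. {"B12": "Weak"}.
        """
        self.evidence.append(item)
        serial = len(self.evidence)  # one more than the piece now on top
        for objective, column in ratings.items():
            self.index[objective][column].append(serial)
        return serial

    def locate(self, objective):
        """Return the evidence bearing on one objective, in filing order."""
        serials = sorted(
            n for column in self.index[objective].values() for n in column
        )
        return [self.evidence[n - 1] for n in serials]


# Mary Smith's folder already holds twenty-six pieces of evidence.
mary = PupilFolder("Mary Smith", evidence=[f"item {i}" for i in range(1, 27)])
mary_serial = mary.file(
    "Record of Incomplete and Unsatisfactory Work", {"B12": "Weak"}
)

# John Jones's folder already holds thirty-four.
john = PupilFolder("John Jones", evidence=[f"item {i}" for i in range(1, 35)])
john_serial = john.file("biology test", {"F1": "Strong", "F13": "Average"})
```

Because each piece of evidence keeps its serial position in the folder, the numbers written on the index are all that is needed to find it again, which is the point the text makes about "exercises self-control."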

The appended profile index uses as its main headings six major 
values which are commonly held to be essential elements of a good 
life or of happiness under present conditions of life in our society. 
The objectives listed under these values are the knowledge, skills, 
habits, interests, and attitudes which are believed to increase the 
chances of attaining these values, both individually and collectively. 
More objectives are listed than any one school would want to use; 
rather, each school would use its own list and might easily get it on 
one or two pages. The columns labeled Weak, Average, and Strong 
refer usually to standing in the lowest quarter, the middle half, or 
the top quarter of the group tested—or, less precisely, to unfavor- 
able, average, or favorable evidence. If a test yields an exact per- 
centile rating on national or local norms, the number should be 
recorded as near this point as possible; each dot represents five
percentile points. The crowding of dots in the center is intentional;
normally the interquartile range on a test is shorter than either the 
top or bottom quarters. Some space is left at the bottom of each page 
for additional objectives to be written in as they are adopted. 
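The rule for choosing a column can be stated compactly. The sketch below assumes percentile ranks from 0 to 100; placing the boundary values 25 and 75 in the Average column is an assumption of the example, since the text does not say how ties at the quartile points are handled, and both function names are hypothetical.

```python
def column_for(percentile):
    """Map a percentile rank (0-100) to a profile-index column."""
    if not 0 <= percentile <= 100:
        raise ValueError("percentile must be between 0 and 100")
    if percentile < 25:
        return "Weak"      # lowest quarter of the group tested
    if percentile <= 75:
        return "Average"   # middle half
    return "Strong"        # top quarter


def nearest_dot(percentile):
    """Round to the nearest dot; each dot stands for five percentile points."""
    return 5 * int(percentile / 5 + 0.5)
```

A student at the 62nd percentile on local norms, for instance, would be entered in the Average column at the dot nearest 60.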

It is hoped that schools will gradually abandon marks in courses 
as their sole record of the development of their pupils. Instead, 
teachers should collect evidence of the development of those charac- 
teristics which increase the chances of attaining happiness, both as 
individuals and as a society. The collection of such evidence should 
not be left to chance—although, once the system is established, a 
great deal of valuable evidence will come in by chance. A standing 
committee on evaluation should decide what evidence is needed and 
where it can be gathered most conveniently. It should schedule the 
collection of specified evidence at times scattered throughout the
year, so as not to overburden the pupil or teacher. It should see to it
that the evidence flows in to counselors, clearly marked with the
number of the objective or objectives to which it refers and with a
letter or other symbol (e.g., a percentile rank) for the degree of
attainment of each objective which it indicates. It should make certain
that the evidence passes through the mind of someone who
knows the pupil and who feels responsible for his all-round development.
The profile index will both stimulate and facilitate these
processes. After some further development it ought to be adopted,
at least as a supplement to, and possibly as a substitute for, the
present system of academic bookkeeping.


Profile Index

[On the printed form each objective is followed by a row of dots under the columns WEAK, AVERAGE, and STRONG, on a scale marked 0, 25, 50, 75, 100.]


A. LIFE-MAINTENANCE

Necessities (food, clothing, etc.)
1. Knows how our own and other economic systems operate
2. Knows about distribution of chief natural resources; resists wasting them
3. Respects property rights, contracts, and regulations affecting them

Practical competence in:
4. Shopping, buying wisely
5. Cooking, serving, dining
6. Cleaning, keeping things in order
7. Caring for children
8. Making things, making repairs
9. Care of house, grounds, property
10. Care of money, banking, insurance
11. Traveling, driving a car

Health
Healthful attitudes
12. Security: self-confidence, poise, independence, flexibility, cheerfulness
13. Affection: is able to give and receive affection; shows good will toward others, etc.
14. Health knowledge: physiology, psychology, hygiene, etc.
15. Health habits: diet, play, rest, cleanliness, medical care, etc.
16. Public health: supports and obeys public health measures
17. Safety: obeys safety regulations, takes reasonable precautions


B. SENSE OF WORTH OR ACHIEVEMENT

1. Is developing a picture of self and of an acceptable role in life which can be sustained
2. Sets reasonably high standards and tries to live up to them
3. Has sense of belonging to a social group without undue dependence on it
4. Wins recognition and acceptance for desirable traits or accomplishments
5. Is developing vocational interests and competence
6. Regards occupation as a contribution to the common welfare, not as a struggle to take something away from others

Work habits or traits:
7. Self-direction, initiative
8. Industry, perseverance, thoroughness
9. Honesty, responsibility
10. Good judgment, decisiveness
11. Orderliness, system, neatness
12. Gets things done satisfactorily and on time


C. FRIENDLY RELATIONS WITH OTHERS

1. Likes people
2. Takes reasonable care of appearance
3. Has a pleasant speaking voice
4. Can entertain others in conversation
5. Can dance, take part in group singing and play popular games
6. Is courteous, tactful, pleasant
7. Is honest, candid, truthful
8. Is tolerant
9. Has a sense of humor
10. Exercises self-control


D. A FREE SOCIETY

1. Shows interest in and concern for the general welfare
2. Sees social significance of current happenings
3. Relates present issues to their historic background
4. Has consistent and enlightened attitudes toward current social issues
5. Can discover, evaluate, and present facts relevant to social issues
6. Can detect propaganda
7. Knows techniques of social action (e.g., how to get a law passed)
8. Is willing to devote time, money, and effort to public affairs
9. Respects law and its agencies


E. AESTHETIC EXPERIENCE

1. Seeks contact with nature and finds refreshment in it
2. Practices at least one of the arts and enjoys several
3. Is moved emotionally and stirred intellectually by literature
4. Listens to good music on the radio and phonograph
5. Can sing a part in group singing or play a musical instrument
6. Responds to artistic qualities in painting and other visual arts
7. Appreciates drama on the stage and on the radio


F. MEANING

Is acquiring, integrating, and applying knowledge of:
1. The natural sciences
2. The social sciences
3. Literature and language
4. The arts
5. Philosophy and religion
6. Vocational studies

Is developing skill in:
7. Reading
8. Writing
9. Speaking
10. Listening
11. Foreign languages
12. Mathematics
13. Interpreting data
14. Applying principles
15. Classifying, defining
16. Seeing relationships
17. Detecting assumptions
18. Criticizing an argument
19. Insight and judgment


SUMMARIES AND INTERPRETATIONS

While the profile index is designed for use by an entire staff of a 
school, it is possible to develop such an index for individual class- 
room use. Should the teacher desire to use this plan it would merely 
be necessary to identify the specific classroom objectives that form 
the basis of the activities and continue to collect all the available 
evidence that is attainable. A folder for each student could be used 
to hold all of the data. This particular scheme offers teachers an 
excellent means of systematically collecting all of the evidence 
which is so necessary for educational diagnosis of student be- 
havior. 


DIAGNOSIS BEFORE PROGNOSIS OR PRESCRIPTION 


The diagnostic approach to evaluation in the classroom is de- 
signed to assist students to achieve their maximum potential. Much 
as the physician uses a variety of techniques before attempting to 
prescribe a remedy for an ailment, the alert teacher uses a variety of 
techniques to discover the causes of student limitations, weaknesses, 
and problems. Whether the cause is within the individual, within the 
environment of the individual, or within the classroom, it behooves 
the teacher to find out why the student behavior is as it is. Only then 
can appropriate actions be taken. In many instances, even after the 
diagnosis is complete, an appropriate prognosis or a prescription 
cannot be found. This is common not only to the field of education;
in numerous instances physicians with the most modern of diag- 
nostic techniques cannot determine the cause of certain internal 




physical difficulties. Working within the limits of his ability, the
teacher can take steps to discover ways and means of improving the
instructional program and in turn of meeting the problem of individual
differences.


CHAPTER 
17 


Guiding Student Progress 


GUIDANCE IS A SOMEWHAT confused and abstract concept. It is 
thought of variously by different people as “taking care of problem
cases”; “giving tests”; “giving advice”; “amateur psychiatry”; “just
extra work.” There are even those schools of thought that consider
guidance as the exclusive province of the specialist, only remotely
related to the day-by-day functions of the teacher, or as a method
of relaxing or lowering the academic standards “to make sure that
no one fails.” 
To be sure, guidance does attempt to help students solve personal 
problems, but it is more concerned with prevention than with therapy;
it does make use of tests as one way of obtaining data, but it
also employs a wide variety of other means and methods of student 
appraisal. It frequently provides students with needed information, 
but it does not give advice of the usual over-the-garden-fence variety, 
and it is not amateur psychiatry. It may seem to be extra work to 
the teacher who conceives of his function to be only that of hearing 
lessons or browbeating his students into submission. Finally, guid- 
ance is not solely the concern of the specialist, nor does it subscribe 
to the philosophy of “soft” education to eliminate failure. On the 
contrary, guidance attempts to raise the general level of human 
achievement by making it possible for every student to develop to 
the limits of his individual capacities. 
The school in which the youth is the focus of the educative process 
does not set a single arbitrary standard which every student must 




meet in order to “pass.” To do so would be as ridiculous and illogical 
as to label every citizen a failure who did not amass a bank account 
of $10,000, grow to a height of 6 feet, acquire an estate of 500 acres 
of land, or run 100 yards in 10 seconds! Such requirements would 
be unfair, you say! Everyone doesn’t consider that he needs a bank 
account of $10,000 to be happy or successful; everyone doesn’t have 
the inherited potential to grow to a height of 6 feet; everyone isn’t 
interested in owning 500 acres of land; everyone doesn’t have the 
physical attributes or skill necessary to run 100 yards in 10 seconds. 
Yet, regardless of his needs, potentials, physical and mental capaci- 
ties, or interests, every student is expected to reach the same level 
of attainment in every phase of his school experience. Obviously, 
such a level is far too high for some, much too low for others, and, 
except for a chance few, inconsistent with the student’s interests, 
desires, needs, and goals as well. 

It should be fairly evident by now that to make education mean- 
ingful and to give it purpose and direction requires a functional 
recognition of the individual differences in students based upon an 
accurate and complete knowledge of all the factors that, singly and 
in combination, condition the learning process for each individual. 

Meaning and purpose are given to education within the framework 
of the guidance program, which is the organized effort of the school 
to provide an environment which will allow each student to grow 
to the limits of his capacities and to develop the competencies, skills, 
attitudes, and characteristics which will make him socially useful 
and give him personal happiness and satisfaction. 

The concept of the guidance program presented in the preceding 
paragraphs suggests or implies that guidance—


1. Is a responsibility of every member of the school staff. 

2. Must be planned and organized to assure a comprehensive pro- 
gram, a consistent philosophy, and an avoidance of overlapping 
and duplication of effort. 

3. Is concerned with the needs of the individual as well as with the 
needs of society. 

4. Reaches and positively affects every student. 

5. Is more concerned with prevention than with therapy (although 
recognizing the value of the latter in specific cases). 

6. Avoids, as much as possible, teacher and/or administrative domi-
nation of the student and stresses cooperative teacher-student
planning. 

7. Encourages and fosters the development of traits of self-control, 
self-direction, and self-reliance and responsibility. 

8. Assists in the development of a realistic self-concept.

9. Is an integral, rather than an extra, part of the teaching function.

10. Provides the student many opportunities for solving problems 
and making decisions through the intelligent application of facts 
and principles. 

11. Firmly believes in the concept of the “whole child,” i.e., that the
students’ emotional, social, physical, educational, and vocational
needs are interrelated and interdependent.

12. Aids the student in formulating realistic educational and voca- 
tional goals, and assists him in perceiving the relationship be- 
tween the goals and his educational experiences. 

13. Recognizes the motivating power of the basic human needs for 
recognition, success, status, belonging, and security. 

14. Collects and records as much factual data about each student as 
it can reasonably expect to utilize. 

15. Recognizes the value of the concept of “developmental tasks”
as basic to an understanding of the behavior of youth. 

16. Is predicated on the proposition that the administrative organiza- 
tion of the school is flexible enough, and the curricular offerings 
varied enough, to permit adapting the student’s educational ex-
perience to his changing interests, needs, and desires.

17. Has access to the services of specialists—physicians, dentists, 
nurses, psychologists, psychiatrists, social workers—for referral 
of difficult and unusual cases. 

18. Incorporates counseling as an essential phase of a complete pro- 
gram. 

19. Recognizes that subject matter is a tool to use in the achievement 
of the more fundamental objectives of good citizenship and social 
desirability. 

20. Realizes that only through close cooperation and interaction with 
other agencies of the community can the best interests of youth 
be served. 


In order to provide the kind of environment in which each student 
can mature to his maximum stature—socially, vocationally, emo- 
tionally, scholastically, mentally, and physically—the guidance per- 
sonnel, and especially the teachers, must— 



1. Know the student as a whole person—his strengths, weaknesses,
likes, dislikes, attitudes, beliefs, values, and goals. 

2. Know the school—its philosophy, facilities, curriculum, and per- 
sonnel. 

3. Know the community—its occupational, recreational, religious, 
and educational opportunities; its cultural influences, its service 
organizations and agencies; its dominant educational philosophy 
and values.

Further, it is essential that all guidance personnel also have a 
working knowledge of the techniques of school and community evalu- 
ation. Only through a planned program of appraisal of both the 
student and his environment can adequate data for effective guidance 
be obtained. 


CUMULATIVE RECORDS 


Every teacher must recognize that there is an almost limitless 
number of factors—innate, acquired, and environmental—which op- 
erate singly or in combination to affect learning either positively or 
negatively, i.e., either to facilitate or to inhibit it. It is necessary for 
the teacher, therefore, who is to guide learning effectively to know 
as much as possible about as many of these factors as he can. The 
teacher's task of "learning" his students is made infinitely easier if 
the school records and maintains all the accumulated data about 
each student in one place—on a card or sheet, in an envelope or in 
a folder, or by some other means. Such a device, which provides a 
developmental picture of the student’s growth and development, is 
called a “cumulative record.”

The record is an extremely useful tool in student guidance, but the 
teacher (and the administrator) must keep in mind that no tool, no 
matter how potentially useful it may be, is of value unless it is used. 

Although many different types of record forms are available com- 
mercially, it is usually desirable for each school to develop its own 
form, based upon— 


1. The purposes for which it is to be used. 

2. The type and volume of the data to be recorded. 

3. The time and facilities available to teachers and guidance workers 
for procuring, recording, and utilizing the data. 

4. The facilities for filing. 

5. The educational philosophy and objectives of the school. 



The ideal record form is one which grows and develops with the
staff and provides for the addition of new areas of information as
the need for such data becomes apparent to the teachers. It is better
to start with a modest cumulative record in which a few significant
facts are recorded about each student and used than to expend all
the energy of the staff in procuring and assembling a large body of
data, leaving neither the time nor the inclination to apply the infor-
mation to the actual guidance of students. Every school would do
well to adopt as guideposts in the development of its own record
system the features of an ideal cumulative record system as reported
in the Handbook of Cumulative Records. This system—


Presents those facts and impressions which staff members consider
to be most significant in revealing and shaping the development of
students.

Clearly indicates the trends of growth and the potential strengths
and weaknesses of students.

Builds up information on each area of a student's experience and
development over a period of years.

Presents information so clearly that a new counselor, principal, or
teacher can read and understand the record without difficulty.

Is used by all staff members as an aid in their daily work with
students.

Requires no more clerical work than can be justified by its prac-
tical use.

In form and content is developed and constantly improved through
the cooperation, study, and experimentation of all staff members.¹


Ultimately, not necessarily at the beginning, the cumulative record
ought to include information concerning the students'—


Identity
Name
Address
Place of birth
Date of birth
Sex
Race (or color)
Suggested procedure for obtaining data: personal data sheet, questionnaire


1 Handbook of Cumulative Records, Bulletin No. 5, Federal Security Agency,
Office of Education (Washington, D.C.: Government Printing Office, 1944), p. 11.



Home and community background 
Name and address of parents or guardians 
Age of parents or guardians 
Birthplace of parents or guardians 
Health of parents or guardians 
Education of parents or guardians 
Language spoken in home 
Type of neighborhood in which home is located, or economic status 
Cultural or racial customs of significance 
Number, age, and sex of siblings 
Facilities and opportunity for study in home 
Marital status of parents (broken home, separated, divorced, etc.) 
Attitude of parents or guardians toward school 
Significant experiences 
Family relationships 


School history and record of class work 


Schools attended (names, locations, and years) 

School marks by years and subjects (include summary of past school 
record) 

Record of unusual successes or difficulties 

Attendance (absence and tardiness) by years 


Health 


Ideally, the results of a complete physical examination by a physi- 
cian or nurse should be recorded for each student. However, in most 
instances it is up to the teacher to note and record health data, except 
for those cases presenting obvious defects or disabilities. The fol- 
lowing items of health information are suggested by Rogers in the 
pamphlet, “What Every Teacher Should Know About the Physical 
Condition of Her Pupils,”² as a summary of important points for ob-
servation:


General
Face and lips
Hair and scalp
Eyes and vision
Record of inoculations
Communicable diseases


Scholastic aptitude
Capacity for learning: I.Q., mental age, profile, percentile


2 James Frederick Rogers, Pamphlet No. 68 (rev. 1945), Federal Security Agency,
U.S. Office of Education.



Scholastic achievement 
Present level of knowledge and skill achieved: 
Success by subject areas 
Special successes 
Strengths and weaknesses 
Courses failed or dropped 
Test scores: raw, derived, grade, age, percentile


Special aptitudes and talents
Athletic, artistic, clerical, literary, mathematical, mechanical,
musical, scientific, others


Interests
Educational, vocational, leisure-time

Personal characteristics and behavior
Significant factors:
Level of maturity: physical, psychological, social, emotional
Degree of “socialization”
Aggressive—submissive tendencies
Happiness
Attitudes
Peer relationships
Problems, conflicts, frustrations
Fears and insecurities
Cocurricular activities:
Types
Number
Offices held
Success achieved: awards, commendations


Work and other nonacademic experiences
Types of work
Duration
Success achieved: promotions, commendations
Interest shown
Nature and significance of other nonacademic experiences


Plans for future 
Educational Vocational Personal 



Post-school activities 

The school should retain contact with its students as a basis for 
evaluating and improving its total educational program. Data should 
be obtained and recorded concerning ex-students: 

College success 

Success in technical or other special schools 

Vocational success 

Personal and/or family life 

It is imperative, therefore, since the cumulative record in most
cases serves as the major source of information concerning a stu-
dent, and is the basis upon which rests most of the counseling carried
on with boys and girls, that it (the cumulative record) be—

1. Valid (truthful). The greatest care must be exercised to check
all information for authenticity and truthfulness at the time it is
obtained or, at least, before it is recorded. Rumors and secondhand
reports must not be recorded unless their veracity is proved. Test
scores should be viewed with skepticism until the reliability and
validity of the tests have been determined and the conditions under
which they were administered and scored are known. An absence of
information is better than untrue information, for judgments and
conclusions based on untruths are almost certain to be wrong.

2. Comprehensive. It must include information about each and
every aspect or phase of the student’s developmental history. It is
not enough to know from the cumulative record that a student has
certain physical characteristics, interests, and intellectual poten-
tialities. The record must provide both a longitudinal and a cross-
sectional picture of the whole student at any given time.

3. Complete. “The truth, the whole truth and nothing but the
truth” might well summarize the first three criteria of a good per-
sonnel record. Not only must a record include valid data about all
phases or aspects of the student's growth and development; also it
must include sufficient numbers of facts concerning each phase so
that the resulting picture of the student will be complete in every
detail. A single intelligence or achievement test score would hardly
be adequate as a basis for judgment concerning a student's intel-
lectual potential or his academic accomplishment. Neither would a
record of a pulse rate provide an accurate gauge of a pupil's
health or physical condition. “A little information is a dangerous
thing” is a paraphrasing of an old cliché which might well be used
as a motto to guide the teacher or the counselor in appraising the
adequacy of his cumulative records. Half-truths and isolated scraps
or bits of information, no matter how truthful they may be in them-
selves, present a very distorted picture, much like a jigsaw puzzle
in which only a few pieces have been fitted into place.

4. Accurate. No amount of information is worth the paper it is
written on if it is not recorded accurately. An I.Q. of 163 recorded
as 103 would make a considerable difference in the type of educa-
tional program suitable for that student. It goes without saying that
only persons who are thoroughly competent and dependable should
be entrusted with the task of transcribing data from the original
sources to the student's personnel record or folder.

5. Reliable. Any marked deviations in the recorded data must re- 
flect actual changes in the student. To this end, there must be a high 
degree of objectivity in the record. Personal opinions and attitudes 
should be kept to a minimum. Of course, only scores from reliable 
tests should be recorded. In those instances where interpretations or 
opinions must be recorded, they should be so labeled in order to 
avoid confusing fact with fancy. 

6. Usable. Obviously all the information in the world is of no value 
unless it is used. How much and how effectively it is used depends 
to a large extent upon the attitude of the user—teacher, counselor, 
or others—but this attitude is greatly influenced by certain factors 
directly related to the record itself. The record file must be easily ac- 
cessible; data must be so recorded as to facilitate easy interpreta- 
tion; all essential data should be included to obviate the necessity 
of referring to several sources for information; data should be ar- 
ranged so as to show development, growth, and progress, including 
periodic summaries of anecdotal records.
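
The criteria of accuracy and reliability described above can be illustrated with a small sketch. It assumes, purely for illustration, a school that keeps its cumulative records in electronic form; the names `Entry`, `CumulativeRecord`, and `transcription_errors`, the field layout, and the sample mental age are hypothetical and are not taken from any actual record system. The sketch shows one way opinions might be labeled as such, how a longitudinal view of one item might be drawn from the record, and how a transcription slip like the I.Q. of 163 recorded as 103 might be caught by comparing the record with its original source.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One dated item in a student's cumulative record (hypothetical layout)."""
    year: int
    area: str                 # e.g. "test scores", "health", "interests"
    item: str                 # e.g. "I.Q."
    value: object
    source: str               # where the datum came from
    is_opinion: bool = False  # interpretations are labeled, never mixed with fact


@dataclass
class CumulativeRecord:
    student: str
    entries: list = field(default_factory=list)

    def add(self, entry):
        self.entries.append(entry)

    def history(self, item):
        """Longitudinal view: every recorded value of one item, oldest first."""
        found = [e for e in self.entries if e.item == item]
        return sorted(found, key=lambda e: e.year)

    def opinions(self):
        """Entries that must be read as judgment rather than fact."""
        return [e for e in self.entries if e.is_opinion]


def transcription_errors(original, transcribed):
    """Return (item, source value, recorded value) wherever the two disagree."""
    return [
        (item, value, transcribed.get(item))
        for item, value in original.items()
        if transcribed.get(item) != value
    ]


# The text's own example: an I.Q. of 163 copied onto the record as 103.
# The mental age figure is invented for the illustration.
source_sheet = {"I.Q.": 163, "Mental age": 16}
record_card = {"I.Q.": 103, "Mental age": 16}
errors = transcription_errors(source_sheet, record_card)

record = CumulativeRecord("Joe Doe")
record.add(Entry(1956, "interests", "Vocational interest", "mechanics", "interview"))
record.add(Entry(1955, "interests", "Vocational interest", "aviation", "questionnaire"))
record.add(Entry(1956, "behavior", "Peer relationships", "seems withdrawn",
                 "teacher comment", is_opinion=True))
```

Checking every transcribed value against its source is a deliberately simple design; it does nothing to establish that the source itself is valid, which remains a matter for the first criterion.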


GUIDANCE HELPS THE STUDENT KNOW HIMSELF 
The goal of all guidance is the development of a human being who 
is happily and satisfactorily adjusted to his environment. Such a 
condition is chiefly dependent upon two factors: 
1. The student must have an adequate and realistic self-understand-
ing or self-concept, based upon a knowledge of as many facts and
factors as he can obtain about himself; and

2. He must have a functional knowledge and understanding of the
environment in which he will live and to which he is expected by
society to adapt himself.


Such knowledge and understanding of self and environment pro- 
vide the youth with the basis for sound and realistic decisions con- 
cerning— 


1. School. School or schools to attend, courses of study to pursue,
subjects to take, cocurricular activities to elect

2. Vocations. Vocational interests, occupational requirements and
preparation, opportunities for employment and advancement

3. Personal-social affairs. Relationships with parents, peers (both
sexes), school (teachers and administrators); affiliations (reli-
gious, political, fraternal, recreational); civic obligations and re-
sponsibilities


Opportunities for the student to know himself and his world can
be provided in many ways in the secondary school; chiefly through—


1. Guidance classes, self-study classes, life-problems classes, voca-
tions (occupations) classes, and the like. In such classes the stu-
dent studies occupations and the world of work; studies human
behavior and human relations; and learns about himself through
tests and other structured devices.

2. Guidance “units” as part of a more comprehensive course, such as
homemaking, community life problems, social studies, and health.
Units such as these provide the same opportunities, although on
a more limited scale, as do the semester courses described above.

3. Homeroom activities, such as hobby groups, elections, participating
in and conducting meetings, clubs, and “drives.” Such experiences
give the student an opportunity to determine his areas of interest,
as well as any special talents or aptitudes he may have.

4. Courses, both required and elected, which furnish the student with
real and convincing evidence of his abilities, interests, and suc-
cesses (and failures!) in a variety of academic and practical sub-
ject fields.

5. Cocurricular experiences (athletics, drama, music, public speak-
ing, and so on) offering much the same type of opportunity for
self-appraisal as do the more academic courses referred to above.

6. Work experience programs, which allow the student to sample sev-
eral vocational fields while he continues to attend regular school
classes. He learns the extent of his aptitude and interest in certain
fields of work from actual firsthand contact with the job and the
people with whom the job requires him to associate.

7. Out-of-school experiences, such as travel, church, and club activ- 
ities, youth groups, and daily activities of living—all of which con- 
tribute to the youth’s fund of knowledge about himself, his strong 
points as well as his weaknesses, his likes and his dislikes. 

8. Counseling, in which the student thinks seriously and purposefully 
about himself, tries to analyze and understand himself, and devel- 
ops a realistic self-concept based upon that understanding. Coun- 
seling, in a sense, is the culmination of all the efforts, consciously 
or unconsciously made, to acquire a sound and sensible attitude
concerning one’s personal assets, liabilities, and goals. 


GUIDANCE HELPS THE TEACHER 


The question might now properly be asked, “Does guidance help
the teacher?” The answer is “yes.” Guidance helps the teacher to do
a better job of assisting the student to make the most of his assorted
interests, aptitudes, and abilities. In that sense, then, guidance is
ultimately and entirely for the benefit of the student.

Specifically, however, a guidance program makes it possible for
the teacher to provide more adequately for the individual needs of 
his students by assisting him— 


1. To plan educational experiences built around a knowledge of the 
individual student's needs, interests, capacities, aptitudes, goals, 
desires, past achievements, and experiences. 

2. To select suitable teaching methods in keeping with each student’s 
unique needs and characteristics. 

3. To develop a common core of understanding with parents con- 
cerning the needs and problems of every boy and girl. 

4. To improve relations with students through a more adequate un- 
derstanding of their motives, problems, and needs. 

5. To reduce disciplinary problems. Many, if not most, such problems 
arise because of frustration or failure to fulfill certain basic needs 
of an inflexible, subject-centered, faculty-dominated school. The 
alternative to such a situation is the guidance-minded school where 
individual differences are considered, where the program of the 
school is tailored to fit the needs of its students, and where oppor-
tunities for the students to develop the skills necessary for self-
determination in a democracy are adequately provided.

6. To locate the causes of behavior problems which may or may not
be commensurate with capacity. The ready availability of com-
plete data concerning all possible sources of difficulty—home, phys-
ical, personal, and school—as well as opportunities for gathering
additional information with a minimum of time and effort make
this possible.

7. To obtain referral sources and information. Such outside help
might come from the school or the community in the form of psy-
chologists, physicians, nurses, dentists, remedial reading specialists,
speech correctionists, local, state, and federal agencies, and others.

8. To develop a professional attitude toward teaching by making the
growth and development of the student the ultimate aim of the
school and giving the individual—the student—priority over sub-
ject matter as the focal point of the teacher’s efforts and attention.


The teacher who willingly and enthusiastically performs guidance 
functions must, of necessity, possess a guidance point of view. He 
must believe it to be vitally important not only to know each stu- 
dent but also to understand him. To know him means to— 


1. Know his level of achievement, skill, and knowledge in the various
areas of curricular and cocurricular experience.

2. Know his general capacity for learning as well as his special apti-
tudes and talents.

3. Know his shortcomings or needs—academic, social, emotional, vo-
cational, personal.

4. Know his dominant interests, motives, goals.

5. Know his health and physical condition.

6. Know his personal problems, attitudes, fears, and emotional dis-
turbances.

7. Know his family background and significant experiences.


To understand him, and therefore to be able to guide him, the 
teacher must— 


1. Realize that all behavior is caused and that any act is, therefore, 
“natural” under the circumstances. Every act of a human being is based 
upon, or influenced by, to some extent— 


a. What has happened to him in the past.

b. What is happening to him at the present moment.

c. What he hopes, expects, or wishes will happen to him in the
future.



2. Recognize that each person is a unique individual, differing from
every other person in the magnitude and pattern of factors which deter-
mine his characteristics and actions at any particular time. Among the
factors which are operative in this respect, and which make Joe Doe
different from Jerry Berry are—


a. Physical size, shape, and proportion
b. Facial characteristics, complexion, and voice
c. Glandular functioning
d. Capacity for acquiring mental and physical skills—general and
specific aptitudes
e. Energy available for activity
f. Knowledges and skills possessed
g. Moral, spiritual, and ethical standards and values
h. Attitudes and beliefs—life philosophy and goals
i. Peer adjustment
j. Home and family background and relationships
k. Self-concept
l. General experience background
m. Unusually significant or disturbing experiences


3. Be conscious of the fact that during every stage of growth the
youth is confronted with a number of common developmental tasks
which he must master before he is completely ready to proceed to the
next stage of his growth cycle. The nature of these tasks depends upon
the maturity level of the boy or girl. During adolescence each youth
faces the task of—


a. Accepting his physique and accepting a masculine or feminine
role.
b. Developing new relations with age mates of both sexes.
c. Acquiring emotional independence from parents and other adults.
d. Achieving assurance of economic independence.
e. Selecting and preparing for an occupation.
f. Developing intellectual skills and concepts necessary for civic
competence.
g. Desiring and achieving socially responsible behavior.
h. Preparing for marriage and family life.
i. Building conscious values in harmony with an adequate scien-
tific world picture.³


3 Adapted from Robert J. Havighurst, Developmental Tasks and Education (New 
York: Longmans, Green and Co., 1951). 



4. Know and understand the significant scientific facts that re- 
searchers have learned concerning human growth, development, learn- 
ing, and behavior. Such facts will enable the teacher more easily to visu- 
alize the student functioning as an indivisible whole, with all aspects of 
his growth, development, learning, feeling, and behaving interdependent 
and interacting. 


5. Not make judgments or draw conclusions concerning any student 
until and unless he has data about that student which are— 


a. Valid, i.e., tell the truth. 
b. Complete and comprehensive, i.e., tell the whole truth. 
c. Reliable and objective, i.e., tell nothing but the truth. 


After all essential data are assembled, it is possible to make tenta- 
tive conclusions, subject to revision if and when additional facts are 
obtained or conditions change so as to make original data inaccur- 
ate, misleading, or obsolete. 


THE TEACHER CONTRIBUTES TO GUIDANCE 


The guidance program is people—students, teachers, administra- 
tors, parents, community. The most important of these people are 
the teachers and the students. True, the program must have a struc- 
ture; it must be planned, but it will not—(cannot)—be any better 
or more effective than the teacher, who, in the long run, must bring 
it into the daily lives and experiences of the students. Teachers 
contribute so much to the guidance program that it is difficult to 
attempt an enumeration of specific contributions. Yet it seems 
necessary to list the following, at least, as some of the major ways 
in which the teacher serves in a guidance capacity:


1. Setting a good example in appearance, language, behavior, prob- 
lem-solving, making decisions, and choices 
2. Assisting in the testing program by— 
a. Administering standardized group tests 
b. Scoring the tests 
c. Recording the results 
3. Contributing to the cumulative record—
a. Anecdotal records 
b. Observations 
c. Other items as they arise 


4. Conducting case studies
5. Participating in case conferences
6. Conducting interviews and conferences with students and parents
7. Providing grades for students
8. Obtaining data by means of questionnaires, checklists, personal
data sheets
9. Reporting symptoms of pupil maladjustments to proper author-
ities
10. Providing opportunities for students to exercise choice and to
make decisions
11. Teaching students how to solve problems and make choices and
decisions on basis of fact and evidence
12. Considering behavior as caused and discipline as education rather
than as punishment
13. Taking every opportunity to show the vocational implications of
subjects and courses
14. Providing opportunities for all students to broaden their under-
standing and their perspective of the world in which they live,
and to develop creative and inquiring minds capable of coping
with the complex problems of today's world


CHAPTER 
18 


Reporting Pupil Progress 


REPORTS TO PARENTS have caused more than their share of headaches
for parents, teachers, and administrators, have resulted in an abun- 
dance of cartoons in popular magazines, and probably have been 
responsible for many student mental health problems. Yet, even 
though this is the case, the chances of abolishing the existing report- 
ing systems are meager. Although schools have long been search- 
ing for better means of relaying their messages home to parents and 
some have succeeded in doing so, the majority still rely upon the
traditional report cards that contain percentage or letter symbols. 

To understand why so much time and effort are devoted to report- 
ing systems, it is necessary to understand the purposes of a good 
reporting system. It is possible to identify at least six major pur-
poses that can be served by an adequate reporting system:

1. Reports provide for a periodic and systematic review of student
growth.


2. Reports inform parents of the progress that their children are 
making in the schools. 


3. Reports provide students with information about their progress in 
the schools. 


4. Reports are used to secure information for administrative purposes. 
5. Reports are used to collect information for guidance purposes. 
6. Reports are used to provide information for promotional purposes. 


Each of these purposes represents a major responsibility of the 
school system that cannot be ignored. 
Schools operate as an agency of government, responsible to the
citizenry that supports them. They are not operated for the benefit
of the administrators or the teachers, but are open because society 
believes that they are necessary for its existence. Society, whether in 
the abstract sense or with regard to the specific citizens living in the 
Clarion, Iowa Independent School District, is entitled to know what 
is taking place in the schools. While there are many means of re-
porting to a community, the report card or one of its derivatives
(letters, conferences, and so on), is one of the key methods of in- 
forming parents about the progress of their children. If the reporting 
system does not do an adequate job of reflecting the school’s pro- 
gram, then the system needs to be changed, not abolished. 

Reporting systems, if they are to provide the community with the 
information that is wanted, can be most successful if conducted on 
a regular basis. Just as an accounting system supplies a business 
firm with monthly and yearly statements, so the reporting system
used by a school system supplies the teachers, students, and parents 
with regular statements of progress. Such periodic and systematic 
reviews require that school personnel plan their activities for the 
school year to provide for valid measurement and evaluation of 
student growth, and for accurate and meaningful reporting of these 
evaluations. 

It needs to be recognized that a report card, whatever its form, 
is only one part of a complete record system. A school needs to 
record and retain much more information about children than can 
be efficiently reported to parents. Evidences regarding the child's 
abilities, interests, needs, achievements, and problems are needed for 
planning the curriculum, evaluating learning experiences and ma- 
terials, guiding individuals, and providing information for later edu- 
cational and occupational placement. These records will include a 
background of family and health information, test data, anecdotal 
records, comments by teachers, summaries of questionnaires and 
interviews, and the like. They are usually more complete and more 
precise than most reports to parents need to be. 


REPORT CARDS AND GRADES 


The primary means of reporting to parents has been and still is 
the report card. While many modifications of the basic form have 
been made for most students and parents, Figure 30 still represents 


EVALUATING STUDENT PROGRESS 


[Figure 30 reproduces a high school report card for Class Year 10. Rows:
Algebra, Geometry, English, Latin, History, Times Tardy, Days Excused
Absence, Days Unexcused Absence, Days Truancy, Effort, Conduct.]

SYSTEM OF GRADING
A: Highest [range illegible]; B: Above Average, 85-94; C: [label illegible], 75-84;
D: Below Average, 60-74; F: Failure, Below 60

Excused Absence: When cause was unavoidable and beyond
one's control.

Unexcused Absence: When cause was avoidable yet there
appears a reasonable excuse.

Truancy: A deliberate absence without sufficient cause.

Fig. 30. Conventional report card 


the report card. During the past twenty years, however, many inno- 
vations in the field of reporting have been developed. Among the 
new reporting systems are the parent-teacher conference, informal 
letters, rating scales and checklists, and many variations and
combinations of each of these techniques. In a later part of this chapter
an analysis and examples of new report forms will be presented. 

The conventional report card with its use of a symbolic scale, such 
as A, B, C, D, and F, has had to stand its ground against an ava- 
lanche of criticism. Arthur E. Traxler has pointedly summarized 
the criticisms that have been brought against the five-point scale 
which is now being used by a majority of the secondary schools of 
the United States. 


There are three main arguments against the use of marks. The first 
has its origin in research; the second results from logical inference, sup- 
ported by experience; and the third arises from mental hygiene. 

Beginning with the work of Starch and Elliott about 1912, many 
studies have been concerned with the reliability of school marks. The 
evidence of those studies is so well known that it does not need to be 
presented here. Suffice to say that Starch and Elliott, Wood, Ruch, and 
others years ago showed convincingly that the reliability of the ordinary 
school marks based on the traditional essay-type examinations are too 
low to satisfy the criteria for individual guidance. Even for examinations 
three hours in length, the reliability coefficients usually fall within the 
range .60 to .80. It is true that the reliability of marks can be improved
by basing them on the results of objective tests, or by using procedures 
to objectify the judgments of the instructors, but all too few schools have 
as yet made a serious attempt to place the grading of their pupils on a 
more objective basis. 

The second objection of marking systems arises out of the observation 
that marks are general statements of achievement, whereas specific state- 
ments are needed in guidance. It may be stated syllogistically as follows: 
General statements about pupils are of limited value in a guidance pro- 
gram. Marks are very general summary statements involving a multi- 
tude of unanalyzed variables. Therefore, marks have limited guidance 
value. The validity of the criticism is at once apparent to anyone who 
conducts a case study of a pupil whose marks are unsatisfactory. One 
is immediately confronted with the problem of collecting a variety of 
specific facts about the pupil’s achievement which will give the marks 
meaning, reveal sources of the difficulty, and provide leads for intelligent 
remedial treatment. 


The third argument against marking is the one most frequently ad- 
vanced by Progressive schools and is also the one which is most vigor- 
ously debated. This objection to marking is that assignment of marks 
causes pupils to compare themselves with each other, and leads to an 
unwholesome state of competition in which the less able pupils are pre- 
destined to lose and to develop feelings of frustration, inferiority, and 
inward rebellion. It may be pointed out in reply to this argument that 
ability to obtain high marks is just one of many ways in which pupils 
vary and that the elimination of marking will not thereby create a 
Utopian institution in which all pupils work together on a basis of com- 
plete equality. It may also be insisted that children who experience 
complete frustration because of low marks and who are unable to find a 
compensatory quality about which to achieve an integration of person- 
ality are already mentally and emotionally sick, and are in need of 
special therapeutic treatment and guidance. Regardless of opinion on 
this point, the first two arguments against marking are sufficient to give 
any school pause if it is committed to a practice of indiscriminate mark- 
ing based on nothing more tangible than teacher opinion and general 
appraisal on a subjective basis. Marks reach their greatest value when 
they are supported by objective data and when they provide information 
concerning the specific strengths and weaknesses of students.¹
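
The reliability range of .60 to .80 cited in the passage above can be made concrete with the classical standard error of measurement, SD × sqrt(1 − r). The following is only a minimal sketch: the function name and the standard deviation of 10 points are illustrative assumptions and do not come from Traxler.

```python
import math


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """Classical test theory: SEM = SD * sqrt(1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


# Hypothetical spread of marks: a standard deviation of 10 percentage points.
ASSUMED_SD = 10.0

for r in (0.60, 0.80):
    sem = standard_error_of_measurement(ASSUMED_SD, r)
    print(f"reliability {r:.2f}: standard error of about {sem:.1f} points")
```

Under these assumed figures the standard error falls between roughly 4.5 and 6.3 points, which suggests why a single mark of this reliability is a weak basis for individual guidance.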


As a result of research-oriented criticisms and of teacher and 
parent dissatisfaction with existing grading and reporting systems, 
some major changes in the area of pupil reporting have taken place. 
Yet in spite of the multitudinous objections to school marks and the 
equally multitudinous suggestions for improvement, they still con- 
tinue to be used widely. 

Improvement in grading practices appears to be essential if some 
of the limitations of existing report cards are to be minimized. Har- 
ris, reporting on a study of grading, promoting, and reporting prac- 
tices in Kentucky, offers the following as characteristics of good 
grading practices: Good grading practices— 


a. Are worked out cooperatively by all who are concerned with them. 

b. May never require that evaluation be reduced to symbols. 

c. Provide for pupil participation according to the level of develop- 
ment of the pupil. 


1 Arthur E. Traxler, Techniques of Guidance (New York: Harper & Brothers, 
1945), pp. 238-39. 


d. Involve a wide variety of procedure; they are not stereotyped. 

e. Are essentially positive in nature. 

f. Involve the use of value judgments arrived at in a democratic at- 
mosphere. 

g. Are openly used and openly arrived at; there is no secret keeping 
of grades. 

h. Do not emphasize formulas or distributions.

i. Involve a minimum of clerical records. (When proper data are collected,
there is probably no need for records of elementary school grades.)

j. Are not necessarily tied to a rigid schedule.

k. Accurately reflect all of the values of the school.

l. Emphasize the total development of the child.

m. Except for value judgments, involve only validated scores and 
marks. 

n. Provide for individual differences without emphasizing them; and 

o. Do not involve the use of fear or threats.²


Grading practices are improved when the characteristics of the 
various grade symbols are carefully defined. The following are 
suggested criteria for defining letter grades: 


The grade of A means— 

1. Objectives of the course are achieved. 

2. Instructor has no reservations about the student's level of achieve- 
ment. 

3. Student is prepared for high-quality advanced work in the field. 

4. Student is highly competent in the application of his learning in 
practical situations where it is applicable. 


The grade of B means— 

1. Objectives of the course are achieved. 

2. Instructor has minor reservations about the student's level of 
achievement. 

3. Student is prepared for above-average quality of advanced work in 
the field. 

4. Student is competent in application of his learning in practical 
situations where it is applicable. 


2 Fred E. Harris, “Three Persistent Educational Problems: Grading, Promoting,
and Reporting to Parents,” Bulletin of the Bureau of School Service, XXVI
(Lexington, Kentucky: University of Kentucky, September, 1953), p. 32.


The grade of C means— 
1. Objectives of the course achieved at a level which the instructor 
regards as minimum preparation for advanced work in the field. 


AND/OR


2. Student has average ability to apply his learning to practical
situations where it should be applied.

The grade of D means— 

1. Objectives of the course achieved at a level which the instructor 
regards as submarginal as preparation for advanced work in the 
field. 

AND/OR 


2. Student has low ability to apply his learning in practical situations
where it should be applied, and little learning to apply.

The grade of F means— 

1. Objectives of the course not achieved at a level which the instructor
regards as the minimum preparation for advanced work in the
field, i.e., the student should repeat the course if he plans to take
further work.


AND/OR


2. The student has not shown any significant learning from the course
that is applicable to practical situations.


OR 


3. The student has failed to complete the course without prearranging
with his instructor for an incomplete.


OR 


4. The student withdraws from the course after the last date for 
dropping a course without academic penalty. 


The search for better methods of reporting has taken individuals 
and schools along many different roads and in many different direc- 
tions. The search still continues. Since the early schools were con- 
cerned almost exclusively with academic achievement, numerical 
or percentage grades were given in all subject matter areas. As the
school's vision of its role unfolded, attention was given to traits of
citizenship and personality. Along with the growing attention to 
nonacademic areas of student growth came the recognition of the 
limitations of the numerical and percentage systems of grading and 
various scales were introduced to overcome the existing limitations. 
The variations ranged from five- and six-point scales to a two-point 
scale of P for Passing and U for Unsatisfactory. While the use of 
the symbolic scale represented a major change in the pattern of 
grading, it did not solve the problem of how to report personality 
characteristics effectively with a grade of A, B, C, D, or F. To meet 
this difficulty dual systems came into use, so that academic achieve- 
ment could be graded using a symbolic scale, and citizenship traits 
and personality characteristics could be evaluated by means of checklists.
These modifications of the traditional report card did not, however,
really meet the major criticisms which had been raised. Other
efforts resulted in the development of the parent-teacher
conference as a method of reporting, the use of informal letters, and 
the gradual evolution of reporting systems designed to appraise 
student progress toward the objectives of education. Along with these 
more recent developments, attention has been turned to student self- 
evaluation and to parental reporting back to the teacher. 

In the pages that follow, examples of recently developed reporting 
forms are presented. They are not intended simply as models to be 
copied, but are presented as examples of what teachers and schools 
can do if they have the desire to develop a more adequate system 
of reporting. 

At the University of Chicago Laboratory School the staff for many 
years has engaged in a continuous study of reporting. They have 
used a variety of techniques including parent-teacher conferences, 
informal letters, and checklists. In some instances, teachers have 
designed their own reporting forms. One of the basic goals of the 
staff has been to develop a method of reporting the contributions 
of each class to the over-all objectives of the school, as well as to 
the various subject-fields. To accomplish this they designed the re- 
porting forms that are shown as Figures 31, 32, 33, 34, and 35. Each 
report has two basic divisions: a section in which the teacher evaluates
the student’s progress toward the all-school objectives, and a
second section in which progress toward specific classroom objectives
is appraised. It is to be noted that the objectives were determined


In working toward attainment of all-school objectives the student:
(Rating columns on the form include Satisfactory and Partially Satisfactory.)

1. Assumes responsibility for personal growth:
   (a) By making effective use of intellectual ability, (b) By
   making an effort to develop character as exemplified in
   such qualities as personal integrity, dependability,
   perseverance, courtesy, self-control, and self-reliance,
   (c) By accepting constructive criticism.

2. Exercises appropriate emotional control.

3. Assists in orderly and effective functioning of school groups:
   (a) By abiding by school rules and customs, (b) By
   showing sensitivity to the needs of individuals and
   groups, (c) By accepting group decisions, (d) By
   performing official duties effectively, (e) By respecting
   property, (f) By serving voluntarily as a leader
   or as a follower.

4. Practices desirable habits of health and safety.

5. Demonstrates appropriate self-direction and persistence:
   (a) By recognizing points within an area on which he
   needs improvement, (b) By working toward improvement
   according to ability, experience, and available
   resources.

6. Uses time wisely:
   (a) By planning effective use of available time, (b) By
   bringing required materials, (c) By assembling equipment
   before starting a project, (d) By starting work
   promptly, (e) By meeting deadlines set by himself or
   others, (f) By cleaning up and putting equipment away.

7. Shows ability to listen effectively.

8. Shows ability to read effectively.

9. Shows ability to speak effectively:
   (a) By providing adequate and accurate content, (b) By
   organizing thought, (c) By observing correct pronunciation,
   enunciation, speed, and tone.

10. Shows ability to write effectively:
   (a) By providing adequate and accurate content, (b) By
   organizing thought, (c) By observing conventions in
   spelling, punctuation, usage, and handwriting.

Comments, if any:


Fig. 31. Section of report form used at the University of Chicago
Laboratory School (This portion was completed by all teachers)


Subject: ART STUDIO
Grade Level


In working toward the attainment of the
objectives of the course the student:
(Rating columns on the form include Partially Satisfactory and Unsatisfactory.)

1. Demonstrates special ability in one field:
   (a) By improving skills in ________, (b) By understanding
   the works of others in this field.

2. Demonstrates special abilities in several fields:
   (a) By improving skills in ________, (b) By understanding
   the works of others in these fields.

3. Demonstrates inventive and imaginative qualities in
   products produced:
   (a) By re-interpreting and readapting the ideas of
   [illegible], (b) By using own inventive and imaginative
   ideas.

4. Produces work of quality:
   (a) By meeting own standards, (b) By soliciting constructive
   criticism, (c) By making constant effort to
   improve, (d) By completing and evaluating own products.

5. Demonstrates an active interest in the work of others:
   (a) By discussing objectively with others, (b) By
   working congenially with others, (c) By giving constructive
   suggestions and [illegible].

6. Demonstrates an understanding of some basic art principles:
   (a) By using evaluative guides in judging own or others'
   [illegible], (b) By effectively [illegible] ideas, thoughts,
   feelings to others through products produced, (c) By
   understanding the work of own contemporaries.

7. Demonstrates an understanding of human rights:
   (a) By accepting the ideas and products of others,
   (b) By helping others willingly, (c) By accepting
   others as equals.

Comment, if any:

Advisory Grade ________ Instructor ________


Fig. 32. Art section of report form used at the University of
Chicago Laboratory School


Subject: SOCIAL STUDIES          Name
Grade Level                      Date


In working toward attainment of the objectives
of the course the student demonstrates:

1. Accumulation of basic information through research:
   By reading widely and selectively for factual background
   as a basis for understanding.

2. Mastery of basic information as measured by tests,
   [illegible]:
   By correctly [illegible] and using information acquired
   through research.

3. Willingness and ability to see [illegible]:
   By being relatively free from prejudice.

4. Ability to attack and solve problems:
   By [illegible] relevant facts to bear upon issues, and
   weighing alternatives for a solution.

5. Ability to use appropriate research techniques:
   By knowing how to use library materials, maps,
   charts, graphs.

6. Knowledge and understanding of current affairs:
   By [illegible], listening to [illegible] at his [illegible]
   level.

7. A sense and use of historical perspective in current affairs:
   By seeing appropriate parallels, contrasts, and
   relations [illegible].

8. [Illegible] effective use of social studies vocabulary:
   By having appropriate concepts of abstract words,
   [illegible] political, economic, social.

9. Oral skills, as measured in class discussion and formal reports:
   (a) Adequate and accurate content
   (b) Organization
   (c) Presentation

10. Written skills, as measured [illegible] reports:
   (a) Adequate and accurate content
   (b) Organization
   (c) Mechanics

Comments, if any:

Advisory Grade ________ Instructor ________


Fig. 33. Social studies section of the report form used at the
University of Chicago Laboratory School


Subject: MUSIC
Grade Level


In working toward attainment of the
objectives of the course the student:


Knows lines, spaces, and clef.
Understands notes, rests, and the values.
Knows the piano keys.
Understands keys and key signatures.
Knows musical terms and signs.
Knows elementary musical form.
Has ability to read music at sight.
Has ability to memorize music.
Performs with good intonation.
Can take rhythmic dictation.
Can take tonal dictation.
Is acquainted with composers and nationalities.
Knows the instruments by sight
Has made progress in discriminative listening.
Can identify musical phrases from memory.
Is making progress in musical interpretation.
Has good enunciation and pronunciation in singing.
Has independence in part singing.
Knows syllable names and their uses.
Has made a good notebook.
Has progressed in the technique of his instrument.
Has improved in student directing.


Comments, if any:

Advisory Grade ________ Instructor ________


Fig. 34. Music section of the report form used at the University
of Chicago Laboratory School


Subject: ENGLISH                 Name
Grade Level                      Date


In working toward attainment of the objectives
of the course the student:

1. Provides adequate and accurate content in:
   writing
   speaking

2. Shows ability to organize materials:
   oral
   written

3. Demonstrates effective speech habits

4. Observes conventions in:
   spelling
   punctuation
   usage
   handwriting

5. Shows ability to apply concepts and principles of the course

6. Uses appropriate research techniques

7. Shows understanding of literature of the course

8. Shows range of reading interests

9. Shows ability to express ideas creatively


Comments, if any:

Advisory Grade ________ Instructor ________


Fig. 35. English section of the report form used at the University
of Chicago Laboratory School


REPORT TO STUDENTS AND PARENTS 
RICH TOWNSHIP HIGH SCHOOL 
DISTRICT #227 
PARK FOREST, ILLINOIS 


Name ........
Subject ........

Term beginning September ...., ending June ...., 19....


GRADING SYSTEM
A = EXCELLENT    D = POOR            + MARK indicates unusually satisfactory progress
B = GOOD         F = FAILURE         NO MARK indicates reasonable progress
C = FAIR         INC = INCOMPLETE    - MARK indicates need for improvement

Columns: First Quarter, Second Quarter, Third Quarter, Fourth Quarter, Final Grade

INDIVIDUAL PERFORMANCE
   Works up to ability (continually tries)
   Has a good attitude
   Shows self-direction
   Plans work wisely
   Knows when and how to seek help

SCHOOL CITIZENSHIP
   Cooperates with group
   Respects the rights and feelings of others
   Contributes his share
   Shows leadership
   Takes care of school and personal property

SUBJECT MATTER KNOWLEDGE AND SKILL
   Learns factual matter and skills
   Applies information to new situations
   Reads with ease and comprehension
   Completes assignments
   Recites effectively
   Scores satisfactorily on tests and exams

TOTAL GROWTH AND PERFORMANCE


Fig. 36. From section of report form used by Rich Township High School, 
Park Forest, Ill. 


Grade in Subject                 Evaluation of skills related to Subject
A  93-100      Excellent         S / Satisfactory
[B and C rows illegible except for the label Fair]
D  75-80       Poor
F  below 75    Failure           N / Needs to show improvement

Columns: quarters I, II, III, IV

[Subject heading illegible]
1. Understands and uses correct grammar.
2. Shows ability in Reading ________
3. Develops style and technique in composition.
4. Develops oral expression.

[Subject heading illegible]
1. Has a chronological concept of events.
2. Shows ability in reading comprehension.
3. Relates past events with present and future conditions.
4. Demonstrates understanding of world [illegible] in class discussion.

Business Education
1. Ability to use the skills learned.
2. Develops appropriate speed with appropriate accuracy (for individual).
3. Develops satisfactory vocabulary and mastery of common words.
4. Practices good working habits.

Social Problems
1. Demonstrates ability in [illegible] of social problems.
2. Applies critical thinking in solving problems.
3. Participates in class discussion and in oral reports.
4. Written skills as measured by essays, [illegible], and tests.


Fig. 37. Adapted from pages 2 and 3 of report form of Sheboygan
Falls, Wis.


by cooperative action, that there is room for additional teacher comment,
and that for transfer and record purposes an advisory grade
can be given. The student is appraised on the basis of individual
growth toward a set of group goals. 

Ruth Strang in a bulletin, “How to Report Pupil Progress,” shows 
the report forms of the Rich Township High School in Park Forest, 
Illinois, and the Sheboygan Falls High School, Sheboygan, Wiscon- 
sin, as examples of new developments in the area. “The Rich Town- 
ship High School [Fig. 36] issues a separate card for each subject. 
The A, B, C, D, or F grade representing total performance and 
growth of the student is broken down into three separate areas: 
individual performance, school citizenship, and subject matter 
knowledge and skill. The grades in these three areas are averaged 
to determine the total performance and growth grade. Brief check- 
lists under each of the three letter-grade areas permit indicating 
commendable performance, special need for improvement, or significant
change.”³
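
The averaging of the three area grades described above can be sketched in a few lines. This is a hedged illustration only: the source does not say how Rich Township converts letters to numbers or how it rounds, so the 4-to-0 point values, the round-half-up rule, and the function name are all assumptions.

```python
# Assumed point values; the report form itself does not state them.
GRADE_POINTS = {"A": 4, "B": 3, "C": 2, "D": 1, "F": 0}
POINTS_TO_GRADE = {points: letter for letter, points in GRADE_POINTS.items()}


def total_growth_grade(individual: str, citizenship: str, subject: str) -> str:
    """Average the three area grades and convert the mean back to a letter."""
    points = [GRADE_POINTS[g] for g in (individual, citizenship, subject)]
    mean = sum(points) / len(points)
    # Assumed rounding rule: nearest whole point, halves rounded up.
    return POINTS_TO_GRADE[int(mean + 0.5)]


# Hypothetical student: A in individual performance, B in school
# citizenship, C in subject matter knowledge and skill.
print(total_growth_grade("A", "B", "C"))
```

A school that wished subject matter to count more heavily than citizenship would need a weighted rather than a simple average; the quoted description mentions only averaging.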

The Sheboygan report, Figure 37, combines the checklist with 
the rating. “The checklist headed, ‘Some typical factors that con- 
tribute to the success of students,’ includes more than thirty items; 
but because it would be impossible to obtain information about each 
item on the list, teachers check only those items they feel competent 
to appraise.” The reproduction shows how one student was rated in 
subject matter.⁴


TEACHER IMPROVEMENT OF REPORTING PRACTICES 


One must face the fact that many of the reporting systems now 
in use neither meet the criteria listed earlier nor resemble the sample 
cards presented in the preceding section. What then can teachers 
using a traditional reporting system do to improve their practices? 
One of the most promising and sound approaches would be for 
teachers to engage in a self-study project in cooperation with ad- 
ministrators, parents, and students to discover better ways of report- 
ing. This practice of self-study has resulted in many of the current 
reporting innovations, for, as teachers sought to find answers to 


3 Ruth Strang, “How to Report Pupil Progress” (Chicago: Science Research Asso- 
ciates, 1955), p. 41. 
4 Ibid., p. 37.


some of their problems, they were able to come up with usable solutions.
Unfortunately, the teacher will not always be in a position to engage
in cooperative self-study; in such situations individual action is the
only alternative.

Confronted by the necessity of doing an adequate job of report- 
ing to parents and students the teacher can, regardless of the system 
in use, improve his reporting techniques. Specifically he can— 

1. Recognize that a report card is only one element in a complete 
reporting system. The report card has often come to represent the 
means of reporting, whereas it is only one element in a system of 
reporting, which includes letters, cards, conferences, and rating 
scales. In turn, the system of reporting should be only one element 
of a carefully maintained system of records—anecdotal records, re- 
ports, test profiles, papers, and cumulative folders. If the teacher 
understands the place of report cards in this total pattern, then they 
assume their proper significance. The emotionally charged atmos- 
phere created in classrooms around report card time can be mini- 
mized when teachers and students see the card as but one phase of 
a continuous process. 

2. Precisely identify classroom objectives so that student growth 
can be evaluated as progress toward clearly defined goals. A great 
deal has already been said about the need for identifying classroom 
objectives so that evaluation can properly take place. Since the re- 
port card is one element of the total evaluation pattern, it would 
seem reasonable that reporting should be concerned with student 
progress in the direction of specific goals. The controversy over re- 
porting student behavior on a classroom-competitive or self-com- 
petitive basis can be relegated to another ineffective either-or argu- 
ment without major significance. 

3. Assign weights to objectives so that the various activities of the 
classroom are kept in their proper perspective. Not all the things 
that are done in the classroom are of equal importance nor do they 
deserve equal weight. This problem assumes a great deal of im- 
portance when the teacher is working on a system where a single 
symbol (A, B, C, D, F) is used to grade a student and where a 
multiplicity of objectives exists. The teacher of English in a com- 
munication course might, in part, have as unit objectives the fol- 
lowing: 


Students should— 

a. Develop the ability to express ideas clearly. 

b. Acquire an understanding of the grammatical structure of sen- 
tences. 

c. Develop the ability to diagram sentences.

d. Acquire an understanding of the parts of speech. 


Each of the objectives may be of importance but certainly not 
equally so. In reporting, the teacher would need to determine the 
relative emphasis to be placed on each of the objectives and then 
to appraise the progress of the student. In much the same way,
teachers need to consider such problems as attendance, participa- 
tion in classroom activities, cooperativeness, promptness, and the 
many other elements entering into the work of the classroom. 
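
One way a teacher might carry out the weighting described in this suggestion is a weighted average of the student's standing on each objective. The sketch below is illustrative only: the four objectives come from the list above, but the weights, the 0-100 scores, and the function name are hypothetical.

```python
def weighted_composite(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Return the weighted average of objective scores."""
    if set(scores) != set(weights):
        raise ValueError("every objective needs both a score and a weight")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(scores[name] * weights[name] for name in scores) / total_weight


# Hypothetical weights: clear expression counts most, diagramming least.
weights = {
    "express ideas clearly": 4,
    "grammatical structure": 3,
    "parts of speech": 2,
    "diagram sentences": 1,
}

# Hypothetical scores for one student on a 0-100 scale.
scores = {
    "express ideas clearly": 90,
    "grammatical structure": 80,
    "parts of speech": 70,
    "diagram sentences": 60,
}

print(weighted_composite(scores, weights))
```

With these assumed figures the weighted composite is 80, whereas a simple average of the same four scores is 75; the difference reflects the greater emphasis placed on clear expression.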

4. Maintain a folder for each student containing illustrative sam- 
ples of student behavior and performance. All individuals are prone 
to forget that which took place six weeks ago and to remember the 
more recent events. Since report cards are supposed to represent 
efforts over a period of time, it is necessary that teachers collect and 
maintain evidence which will enable them to do an effective job of 
reporting. One of the easiest ways of handling this problem is to 
prepare a folder for each student in the classroom. In this folder 
can be deposited student papers, attendance slips, special reports, 
records of observations, rating scales, test results, and all other evi- 
dence that might be useful at reporting time. In Chapter 16, a de- 
tailed plan for securing and maintaining continuous information 
about students was described as part of the educational profile. 

5. Help students understand what the report card is and the basis 
upon which the evaluation is made. Teachers might profitably de- 
vote time at the beginning of each new year, semester, and/or unit 
to a carefully prepared discussion of the objectives of the forthcom- 
ing portion of the classroom activities. This discussion should help 
students understand the objectives and also provide them with in- 
formation as to the techniques the teacher expects to use to evaluate 
and rate them. Students are often concerned by teacher demonstra- 
tions of favoritism, so it is to the teacher’s advantage to define as 
objectively as possible the basis upon which grades are to be as- 
signed. Students are as anxious to know what is happening in a 
classroom as are the teachers, and efforts made to increase their 


406 EVALUATING STUDENT PROGRESS 


understanding of the educational process often pay extra dividends 
in the form of student interest and cooperation. 

6. Supplement the letter or numerical grade with comments and 
special parent and/or student conferences. It isn’t by mere chance 
that a strong movement has developed to replace or supplement the 
regular report card with informal letters and conferences. As parents
became more and more interested in their children’s education and as
teachers became more and more dissatisfied with the existing reporting
forms, it was inevitable that informal means for sharing information
would evolve. One means of informal communication is for
the teacher to include a brief comment about a student on the regu- 
lar report card. While this does not provide for an exchange of ideas 
between parents and teachers, it does serve the useful purpose of in- 
forming parents that the teacher has something special to say about 
their youngster. 

A more profitable form of reporting is through the use of informal 
notes or letters as the appropriate occasion arises. Teachers need to 
recognize that letterwriting is a skill that requires the development 
of basic communication competencies. Letters or notes should be 
written not only when a student is in difficulty but also to inform 
parents of positive contributions made by their youngsters. Too 
often the only direct contact the school has with parents is of a 
negative nature: “Johnny is failing in Algebra,” “Susan was truant
from classes last week,” and “Richard is causing us trouble.” The 
teacher through informal notes can help parents appreciate the con- 
tributions public education is making to student growth. 

One of the recent developments in the area of reporting has been 
the evolution of parent-teacher conferences as the primary method 
of informing parents of the progress of their children. The move- 
ment has been well received in the primary-grade levels and is 
gradually moving up the educational ladder. Parent conferences 
with teachers are not really new, for teachers have always consulted 
with parents about their children. What makes the present develop- 
ment new is the systematic attempt to talk to the parents of all 
children rather than just the parents who either come to the school 
on their own or because they are called for behavior or academic 
difficulties. Reporting to parents has now become, in some school 
systems, the heart of a program of school-community relationships. 


Teachers engaged in parent conferences would do well to review 
the interviewing techniques described in Chapter 10, for the prin- 
ciples and techniques developed in that section are also applicable to 
parent-teacher conferences. Some specific suggestions for conducting 
parent-teacher conferences are as follows: 


a. Plan in advance for the conference.

b. Schedule the conference at the best time for you and the parents 
and in the most pleasant location. 

c. Have conference materials ready and use tangible evidence of stu-
dent achievements, progress, and problems. 

d. Establish the best rapport that you can by emphasizing the positive 
aspects of student behavior (certainly in the early stages of the 
conference, at least). 

e. Interpret technical data (M.A., raw test scores, grade equivalents) 
in terms that are understood by the parents. 

f. Keep the conference “on-the-ball”; don't let it disintegrate into a
“gabfest” or an exchange of compliments.

g. Attempt to make the conference a two-way process, with a sharing 
of information by both parents and teachers. 

h. Follow up the conference by recording what has been learned and 
supply parents with additional information as needed. 


7. Offer students ample opportunity to engage in the process of 
self-appraisal. The point has frequently been made in the preceding 
chapters that one function of a complete program of evaluation is to 
provide students with information so that they can more adequately 
engage in the process of self-evaluation. One of the best opportuni- 
ties available to teachers to engage in this process is during a re- 
porting period. Since reporting represents an “inventorying,” it 
should be participated in by the students as well as the teachers. 
An example of how this might be done is presented in Figures 38 and 
39. Two weeks prior to the reporting day, the teachers using Form A 
have their students review their accomplishments. The comments 
made by the students and the observations by the teachers are re- 
corded on Form B, Figure 39, which is then sent home to the parents. 
When the student’s evaluation and the teacher’s evaluation do not 
coincide too well, the teacher and student meet and talk about the 
differences. This technique can be adapted by a teacher to meet the 
unique situations in each school system. The material need not be 
sent home but might be used exclusively for student self-appraisal 
and for student-teacher conferences. 


Fig. 38. Sample self-reporting form 


FORM A 
Core Checklist 

Name _______________     Date _______________ 


A. Effectiveness in oral expression 

_____ 1. Am I well informed on my subject when I speak? 

_____ 2. Do I organize my oral expression so that it is simple, 
         direct, and straightforward? 

_____ 3. Do I speak distinctly and audibly? 

_____ 4. Am I feeling more comfortable when speaking to a group? 

_____ 5. Am I increasing my participation in group discussions? 

Comments: 


B. Effectiveness in written expression 

_____ 1. Do I use correct manuscript form in my written work? 

_____ 2. Is my written work neat and attractive in appearance? 

_____ 3. Do I spell words correctly? Am I improving in my ability 
         to spell? 

_____ 4. Do I write interesting and clear sentences? 

_____ 5. Do I organize my writing so that it is simple, direct, 
         and straightforward? 

_____ 6. Is the content of my written work adequate? 

_____ 7. Do I proofread my written work carefully and eliminate 
         careless errors? 

Comments: 


C. Effectiveness in reading 

_____ 1. Am I able to grasp the main points of factual material 
         I read? 

_____ 2. Am I able to understand information in graphs and charts 
         and maps? 

_____ 3. Am I reading as many good books of various kinds as I 
         should? 

_____ 4. Am I increasing my understanding of difficult words? 

Comments: 


D. Understanding of social studies material 

_____ 1. Am I becoming more familiar with the story of the 
         American people? 

_____ 2. Am I increasing my understanding of how the past is 
         related to present day problems? 

_____ 3. Am I becoming more interested in and better able to 
         understand current affairs? 

Comments: 


E. Effectiveness in group relations 

_____ 1. Am I conscious of other people's rights and privileges 
         as well as my own? 

_____ 2. Am I thoughtful and considerate of other people's 
         feelings? 

_____ 3. Do I attempt to contribute at all times to group 
         progress? 

_____ 4. Do I listen to other people's comments and opinions and 
         attempt to understand them? 

_____ 5. Do I take my share of responsibility in committee work? 

Comments: 


F. Effectiveness in work habits 

_____ 1. Am I honestly trying to overcome weaknesses and 
         difficulties? 

_____ 2. Do I have a good system for going about my work? 

_____ 3. Am I prompt in completing work? 

_____ 4. Am I able to locate information quickly and efficiently? 

_____ 5. Do I attempt to concentrate on the business at hand in 
         spite of interruptions? 

_____ 6. Am I thorough in my work? 

_____ 7. Do I make sure I understand printed or oral directions? 

_____ 8. Am I improving in doing my work without asking for help? 

Comments: 


Teacher _______________ 


Fig. 39. Student-teacher reporting form 


FORM B 
Semester Report--Social Studies--English 7 

Student _______________     Date _______________ 


PURPOSES                 PUPIL'S EVALUATION       TEACHER'S EVALUATION 
                         OF HIMSELF               OF THE PUPIL 

Effectiveness in 
oral expression 

Effectiveness in 
written expression 

Effectiveness in 
reading 

Understanding of 
social studies 
material 

Effectiveness in 
group relations 

Effectiveness in 
work habits 


ADDITIONAL PUPIL COMMENT: 

ADDITIONAL TEACHER COMMENT: 

8. Prepare a supplementary report card (ditto machine or mimeo- 
graph) that can be used exclusively for the individual classroom. 
There is no reason why a classroom teacher wishing to improve his 
reporting practices could not develop a report card similar to the 
ones shown in this chapter or prepared specifically to meet indi- 
vidual classroom needs. The teacher-made card can be prepared in- 
expensively and used for many different purposes. The chances are 
rather good that an administrator would cooperate most whole- 
heartedly with a teacher desiring to foster better home-school rela- 
tionships. 

Regardless of the method used to report to parents, the system 
should reflect and reinforce the basic purposes of the school. In the 
modern school, with its broad scope of interests and developmental 
objectives, the traditional report card needs to be replaced because 
it does not tell enough or do enough. Reporting should build a child’s 
confidence and reinforce parent-child relationships, for there is 
nothing to be gained by giving the child a false evaluation of him- 
self or setting the parent against the child. To develop a sound sys- 
tem of reporting, sufficient time must be taken to plan and learn 
the new techniques. Everyone who is directly affected by a plan 
should help to make it, evaluate it, and revise it. 


CHAPTER 19 


Evaluation and the Teaching-Learning Situation 


THE END RESULT of a program of evaluation is not the accumulation 
of files of dust-gathering test results, but the improvement of the 
teaching-learning process. Tests, cumulative records, rating scales, 
interest inventories, sociograms, observations, and all of the other 
techniques that have been described in the preceding chapters are 
of little value unless they are used by teachers to assist students 
to achieve desirable educational objectives. It is far better actually to 
use fewer techniques of measurement, but to use them, than it is to 
develop a large-scale program of measurement and then dedicate the 
results to obscurity. 

A functional program of measurement and evaluation to be most 
effective needs to be continuous. All too frequently evaluation is 
considered a step in the teaching cycle that comes at the end of a 
chapter, unit, or semester. This concept assumes that the major use 
of the tools of evaluation is to measure the end product and to pro- 
vide the teacher with information so that appropriate grades can be 
awarded. Unfortunately, this limited concept of evaluation has re- 
sulted in patterns of student behavior—cheating, dishonesty, and 
fear—that defeat the essential purposes of education. When evalua- 
tion becomes synonymous with testing and testing becomes synony- 
mous with threat, then it is almost impossible to use evaluation to 
improve the teaching-learning process. 

When the tools of measurement are used throughout the teaching-learning 
cycle, it becomes possible to minimize the cheating, dishonesty, 
and fear which so often accompany existing programs of 
end-product testing. Students can be made aware of the fact that 
the techniques of measurement are also learning devices, that the 
results of measurements are to be used to help them achieve educa- 
tional objectives, and that the frequent use of a variety of measur- 
ing instruments offers them ample opportunity for self-evaluation. 
Continuous evaluation means that the process takes place at the 
beginning of a unit of work, during the course of the unit of work, 
and at the end of the unit of work. It means that continuing oppor- 
tunities are available to the teacher and the students to appraise 
progress, and to adjust or modify their programs if necessary. A pro- 
gram of continuous evaluation is based upon the assumption that 
appropriate tools of measurement will be used whenever needed. 


EVALUATION AND CURRICULUM CHANGES 


It is common practice to talk about individual differences in the 
classroom and then to teach the same material the same way to all 
students. The absurdity of this situation is clear, but remedies to 
correct it are not always clear. One of the remedies is to engage in 
systematic evaluation to (1) find the weaknesses and strengths of 
individual students; (2) determine needed modifications in se- 
quence and emphasis; and (3) locate unnecessary duplication. 

There is no debating the fact that students are not the same 
intellectually, physically, socially, or emotionally. No method of 
grouping can eliminate student differences. Even were it possible to hold 
one variable, such as reading achievement, constant by placing all 
students with identical reading ability in a particular section of 
English, within a relatively short period of time the differences 
within the group would again reassert themselves. All of the stu- 
dents might have the same general reading level, but there would 
be differences in vocabulary, rate, comprehension, and interest. 

Acceptance of the fact that individual differences do exist in 
classes means that teachers need to study as carefully as they can 
individual and group strengths and weaknesses. This can be done by 
reviewing the data that are available on cumulative records or in 
student files. It also can be done by using informal and standardized 
tests at the beginning of the school year or whenever new units of 
work are introduced to secure a comprehensive picture of student 
achievement. Informal techniques can be used to isolate problems 
of interest, attitude, and work habits that may need to be taken into 
account during the forthcoming school year. 

Once there are sufficient data upon which judgments can be made, 
the teacher can then determine the answers to some of the following 
questions: 


1. Are there any content areas that can be eliminated from the 
course sequence for the coming semester? In the area of English 
it might be possible to eliminate certain topics, such as outlining 
or comma punctuation, if the students seemed to have a good 
grasp of those areas. In mathematics, pretests can help the teacher 
locate starting points and eliminate review work. 

2. Are there any content areas that need to be added to the course 
sequence for the coming semester? Just as it may be possible to 
locate content areas that need to be eliminated, so it may be pos- 
sible to locate areas that should be included in the program. In 
the social studies, the teacher of American history may discover 
that the students have had very little opportunity to study the 
social and cultural history of the nation. Such information may 
cause the history teacher to include a unit on that topic in the plan 
for the semester. 

3. Are there content areas that need to receive greater emphasis? 
During the semester the science teacher may discover that the stu- 
dents, although they studied the scientific method, did not seem to 
understand it when other topics were introduced. Perhaps the sci- 
ence teacher will decide to introduce more material on the scientific 
method or perhaps he will wish to consider the possibilities of de- 
veloping a more complete unit on that topic for the next year. The 
teacher of American problems may decide that trying to cover 
eight problems a semester is too many and that the students would 
benefit more from a concentration upon four problems per semester. 

4. Is there a need for changing the sequence of the course units? The 
teacher of history might raise some questions about the desirability 
of changing from a chronological approach to a topical or problems 
approach. The science teacher might want to consider whether or 
not it would be best to start with a study of human beings or with 
single-celled animals. 

5. Is there a need for new courses to meet changing conditions? While 
most frequently the addition of new courses to the total school 
program is not the responsibility of a single teacher, he does have 
a responsibility for making recommendations for needed additions 
to the school program. The algebra teacher might readily conclude 
that many of the students could profit from classes in general 
mathematics rather than from the regularly established mathe- 
matics sequence. The music teacher could help secure the infor- 
mation needed to establish an introductory course in music appre- 
ciation. 


Revamping of the curriculum is not one teacher’s responsibility, 
but the readjustment of the program offered within a single class- 
room is the responsibility of the teacher. Evaluation gives the 
teacher the necessary evidence to make intelligent decisions concern- 
ing changes in the program. 

While questions of curriculum modification are generally de- 
scribed for total units, it needs to be remembered that the changes 
may need to be made for individuals rather than for groups. The 
same types of questions that have been raised above for over-all 
curriculum modification can be raised for individuals: 


1. In what areas are individual students strong? 

2. In what areas are individual students weak? 

3. What modifications of the curriculum can be made to assist the 
weak students and encourage the strong students? 

4. How can an enrichment program be developed for exceptionally 
gifted students? 

5. How can remedial work be provided for students that appear to 
have ability but are doing poorly? 


If the curriculum is not considered a set pattern of courses to be 
offered each semester but a sequence of integrated learning experi- 
ences designed to foster progress toward well-defined objectives, then 
the teacher can use evaluation to foster individual growth. It must be 
remembered that a purpose of education is continuous growth for 
all students, not the same growth for all individuals. 


EVALUATION AND INSTRUCTIONAL IMPROVEMENT 


Teachers use various methods to assist students to learn facts, 
skills, attitudes, interests, and problem solving. Teacher-student 
planning, educational films, field trips, group discussions, and 
individual projects are means by which students learn to appreciate 
music, understand the difference between an element and a com- 
pound, acquire an interest in biography, and outline their problems. 
Many times the means by which individuals learn subject matter also 
represent things that are learned. A teacher of art may use group 
discussion to help students learn to appraise paintings and at the 
same time develop cooperation. 

The method by which a particular subject is taught is extremely 
important and some educators indicate that the method employed by 
the teacher may be as important as, if not more important than, the 
subject matter which is being presented. It is maintained, for exam- 
ple, that the lecture-recitation method used in the study of American 
literature may produce students who can answer factual questions 
on examinations but that these students are neither interested in, 
nor appreciate, American literature. 

It is probably correct to assume that both the method employed 
by the teacher and the content studied are significant aspects of the 
teaching-learning cycle. The evaluative process not only helps teach- 
ers make some determinations about the content that they are us- 
ing but also assists them to make some decisions about the methods 
they should use in developing content areas. 

It is possible to discover many deterrents to the teaching-learning 
process and it is very important for teachers to study their instruc- 
tional methods to see in what ways they can eliminate the blocks to 
learning. Evaluation in the classroom enables the teacher to discover 
the weaknesses and strengths of his instructional methods just as it 
enables him to diagnose the weaknesses and strengths of his students. 
Evaluation provides the basis for answering the following questions: 


1. Are the instructional methods being used the best for the objec- 
tives that are being sought? The teacher of geometry desiring to 
help students acquire the skills of logical thinking may need to 
develop problems in which such skills are demonstrated. The 
teacher of typing may need to stimulate the pupil’s interest in 
typing by means of visits to business offices instead of relying on 
drill exercises. The teacher of dramatics may want to use record- 
ings of famous actors for demonstration purposes. 


2. Are the instructional methods being used the most appropriate for 
the students? It is generally recognized that in the typical high 
school classroom there is a reading range of six years or more; yet 
in that classroom all students use the same textbook. By evaluat- 
ing student progress the teacher can find the student's level of at- 
tainment and then whenever possible provide appropriate textbooks 
and other instructional material. Lectures may be acceptable at 
the college level, while at the junior high school level they would 
generally be inappropriate. For some students a review of the text- 
book may be all that is needed to help them understand a particu- 
lar concept; for other students discussion might be necessary; for 
still others an actual demonstration of a process may be necessary. 

3. Are the instructional methods being used the most efficient for 
teaching facts, concepts, attitudes, and skills? In a chemistry class 
an experiment using some expensive equipment might be dupli- 
cated by all members of the class or it might be done as a demon- 
stration by the instructor. The question needs to be raised whether 
or not it would be necessary for all students to duplicate the ex- 
periment. If the students can learn as much by watching the 
demonstration as by conducting the experiment, then it probably is 
more efficient to have the teacher perform the demonstration. 
Efficiency cannot be used as an excuse for teacher-dominated class- 
rooms. What may appear to be the most efficient way of teaching 
factual information may be the most inefficient way of teaching 
students how to make value judgments. Efficiency needs to be 
considered along with the educational objectives being sought in 
the classroom. 

4. Are the instructional methods being used fostering satisfactory 
development of the individual? A purpose of education in the 
public schools is to help each individual master the tasks that are 
necessary for effective participation in society. The most impor- 
tant test of the teacher's instructional methods is how successfully 
the students have grown as a result of the methods that have been 
used. No method, no matter how interesting, can be considered 
useful if the students have not been helped to grow intellectually, 
emotionally, and socially. 


The results of the evaluative process are used to improve the 
teaching-learning situation by: (1) introducing new learning situa- 
tions, (2) regrouping of students, (3) providing for remedial work, 
(4) using functional drill, and (5) providing for individual needs. 



EVALUATION AND MOTIVATION OF LEARNING 


There is ample research evidence available to indicate that one of 
the most important elements in the teaching-learning process is the 
motivation of the individual student. While psychologists and edu- 
cators do not agree as to the exact nature of motivation, there is sub- 
stantial agreement that such a thing as a motivating force does exist 
for all individuals. The extent to which an individual is motivated 
will control the effort put into a task, the gains that will be made 
initially, and the long-range behavior changes which indicate that 
learning has resulted. 

Students may be compelled to engage in school activities by the 
use of threats and punishments, but the results produced by such 
methods frequently lead to negative reactions toward the entire 
educational process. The amount of learning which takes place is 
directly proportional to the degree of motivation that the student 
feels. 

Motivation may stem from the student's desire for recognition, his 
need for individual security, or his desire for new experiences. It 
is reasonably clear that motivation is a personal thing; what 
motivates one student may not motivate another. The origins of personal 
motivation are to be found in the complex of forces that influence 
human behavior—heredity, family, friends, community, and experi- 
ence. 

Since motivation is a personal thing, it is necessary for the teacher 
to use the process of evaluation to discover the motivational stimuli 
appropriate for each student in the classroom. By using a variety of 
evaluation techniques, such as interest inventories, checklists, obser- 
vation guides, interviews, sociograms, personality tests, and rating 
scales, the teacher is helped in this task. The teacher then uses a 
variety of learning situations to capitalize on the potential that al- 
ready exists. 

By discovering the student's interests and needs, the teacher has 
also discovered the areas in which the student has little interest. 
This poses a real challenge to the teacher to promote motivational 
situations that will give the student new insights into areas that 
have not been important previously. Frequently, students will take 
mathematics because they are required to do so by the school or 
by their parents. They may not be interested in the area, but the 
good teacher can give purpose to the study of mathematics and 
provide the stimulus that will motivate many youngsters to learn 
mathematics. 

One form of stimulus used in the classroom to motivate students 
is represented by the program of evaluation. Unfortunately, much 
of the evidence about the relationship between motivation and 
evaluation is restricted to the use of tests as the only means of 
evaluation. The conclusions that have been reached, however, indi- 
cate that a complete program of evaluation is certain to have a major 
impact upon the motivation of students in the classroom. 

1. Students learn more successfully if they know and understand 
the progress they are making. While the research evidence pertain- 
ing to the frequency of evaluation and success in classwork is not 
conclusive, there is ample evidence to suggest that students make 
considerably more progress when their efforts are appraised and 
they are helped to see the progress they are making. The evaluation 
must help the students diagnose their difficulties and the teacher 
must be prepared to assist them to overcome the difficulties. 

The concept of continuous evaluation that has been described 
previously is especially important in connection with this generali- 
zation. Evaluation that is sporadic, or evaluation that takes place 
only at the end of grading periods or at the end of the semester, re- 
duces the potential learning value of the evaluation techniques. 
Continuous evaluation can assist the teacher and the students to en- 
gage in productive learning. 

A caution is necessary at this point. The research evidence also 
seems to suggest that the too frequent use of an evaluation tech- 
nique, such as daily testing, very soon reaches a point of diminishing 
returns. Just as the repetition of a single teaching technique may 
lead to boredom, so, too, will the constant use of one type of evalua- 
tion technique lead to a reduction of teaching efficiency. 

2. Students frequently adapt their study habits to the pattern of 
evaluation used by a teacher. It has been found that students tend 
to study for the types of tests that teachers give. If a teacher con- 
stantly uses factual examinations, the students will spend their 
study time memorizing facts; if teachers test for understanding of 
concepts, trends, and relationships, the students are more inclined 
to organize their materials to be able to answer such questions. 
Since the type of evaluation program does influence the manner 
in which students study, it becomes increasingly important for 
teachers to consider seriously the types of evaluation instruments 
that they are using. 

Once again the concept of total evaluation has an important bear- 
ing on the learning habits of students. If the teacher uses evaluation 
procedures infrequently and bases grades upon those infrequent 
tests, students will be encouraged to cram for an examination and 
the end purpose of education will become a grade on a report card. 
Continuous evaluation encourages the concept that education is a 
never-ending process and that what the student does in any one 
semester helps him to expand his horizons so that he will be better 
equipped to handle his present and future problems. 

Evaluation techniques can be a means for helping students develop 
good study habits; they can be used to encourage students to locate 
errors; and they can be used to direct the progress of students 
toward desirable goals of achievement. 

3. Students are helped to evaluate their abilities, interests, and 
attitudes. In recent years increasing attention has been given to 
helping students develop intelligent plans for the future. Guidance 
programs, such as described in Chapter 16, have been developed in 
small, as well as large, high schools. These programs have as a 
fundamental aim the over-all adjustment of the individual in so- 
ciety. Every classroom teacher has the responsibility of assisting 
students to discover their strengths and weaknesses, to help them ex- 
plore new areas, to build new interests, and to counsel with them. 

Many of the informal techniques of evaluation are better suited 
to this purpose than the more formal teacher-made or standardized 
tests. Several self-report devices, such as the Kuder Preference 
Record, SRA Youth Inventory, and the Mooney Problem Check-list, 
can be used to help students find their interests and their problems. 
Once the information has been collected, it is important that the 
data be put to use. Frequently the technique is used but, unless 
efforts are made to help the student understand the results, self- 
evaluation cannot be very effective. 

There is one other aspect of the problem of motivation and evaluation 
that needs to be carefully considered. While our major concern 
is the influence of tests, rating scales, interviews, and the like upon 
students, it is necessary to realize that these same techniques can 
and should influence the teacher. Tests, for example, should not only 
motivate students to study; they should motivate teachers to find 
ways and means of doing a better job of teaching. The development 
of a comprehensive program of evaluation provides the teacher with 
information about student attitudes, interests, and appreciations, 
as well as about academic achievement. 

The evaluation program should motivate the teacher not only to 
create better learning situations in the classroom but should also 
motivate the teacher to discover the causal factors influencing stu- 
dent behavior. Interviews, observations, student autobiographies, 
cumulative records, and case studies furnish the teacher with the 
needed background so that effective help can be given students. 
While the teacher generally deals with the symptoms of behavior, it 
may be possible, using the available tools of evaluation, to uncover 
the causal factors. Once the causal factors are located, the under- 
standing teacher may be able to provide real assistance to the stu- 
dent. 

While evaluation devices are frequently used to motivate student 
behavior, what is desired is to develop classroom conditions in which 
the individual learns because he wants to learn and not because a 
grade is held out as a final reward. An effort needs to be made to 
reduce the emphasis upon the grade and to increase the emphasis on 
the individual’s self-development. The question that teachers need 
to ponder is this: “How can I motivate my students so that they 
will want to learn even if they do not receive a grade at the end 
of the semester?” The answer to the question, in part, involves the 
intelligent use of the tools of evaluation. 


EVALUATION AND THE PSYCHOLOGICAL SECURITY 
OF THE TEACHING STAFF 


In any discussion of the teaching-learning process the impor- 
tance of the teacher needs to be considered carefully. While it is pos- 
sible for boys and girls to learn without teachers, which they fre- 
quently do in many nonschool situations, systematic and efficient 
learning probably would not take place unless there were teachers 
to facilitate the action. Not all teachers, however, can or do play 
the same role even though they find themselves in similar situations. 

Just as students differ, so, too, do teachers differ in their personal 
characteristics and in their techniques of teaching. The methods 
used by Mr. Jackson, for example, may be effective for him, but if 
used by Mr. Thompson would be very ineffective. Each teacher 
needs to discover his own best way of using the personality char- 
acteristics, skills, and knowledges that he possesses. There isn’t any 
one best way of teaching. There are many good ways of teaching, 
and evaluation should help the individual teacher select those meth- 
ods most appropriate for him. 

There are many common hazards faced by the teacher which he 
must understand if he is to make adequate decisions regarding his 
most effective teaching methods. These hazards include: (1) per- 
sonal problems, (2) conflicts with others, (3) public criticism of 
teachers and teaching, (4) feelings of inadequacy, (5) lack of recog- 
nition, and (6) lack of security. 

Every teacher brings into the classroom personal problems that 
cannot be left at the front door. Illness in the family, financial diffi- 
culties, marital problems, and professional problems cannot be 
wiped away with a grin. While it is commonly expected that the 
teacher will be able to hide his problems under a cloak of teaching 
efficiency, it is totally unrealistic to assume that such problems will 
not influence the way that a teacher teaches. 

Another source of psychological difficulty for the teacher results 
from conflicts in values that may be held by others in, or associated 
with, the school system. Beginning teachers frequently find them- 
selves in conflict with teachers who have been in the school for a 
number of years. Many times teachers will discover that a conflict 
exists between the way they believe a class should be managed and 
the way that an administrator believes it should be done. Under 
other circumstances teachers may discover that the values they seek 
to develop in the classroom are being contradicted by the values 
held at home. Each of these conflict situations may create serious 
psychological problems for teachers. 

One of the most significant hazards faced by teachers is the feeling
of inadequacy that comes from engaging in an activity in which
the results of one’s efforts are not always readily or immediately 
discernible. Because boys and girls come from so many different 
environments, because they do not mature at the same rate, because 
they have different learning rates, and because they respond differ- 
ently to identical situations, the teachers are faced with an array 
of intangibles that may be impossible to analyze. In this situation 
it is not surprising that teachers will frequently not know with cer- 
tainty what the fruits of their labor are. 

The conscientious teacher, seeking to provide the most effective 
learning situations for his students, may often have to wonder about 
the adequacy of his knowledge and the skill with which he teaches. 
The feelings of inadequacy are not limited by the grade or subject 
taught, for within each class there often appear to be youngsters that 
are difficult to reach and youngsters that are difficult to help. 
School situations may be particularly frustrating to the teacher 
seeking to help boys and girls develop some of the less definitive
objectives of education associated with interest, attitude, and ap- 
preciation. Many times the facts that are taught can be evaluated, 
but the understandings, critical thinking, and skills associated with 
the facts are far more difficult to evaluate. When feelings of in- 
adequacy are associated with a lack of recognition, serious psycho- 
logical problems for the teachers may arise. Research in many fields 
of human endeavor has shown convincingly that one of the basic 
needs of all individuals is recognition. This may come in the form 
of salary increases, but it may be just as effective if it is the ap- 
proval of one's friends or associates. Because teachers may often 
feel inadequate in their efforts to help youngsters learn, it may be 
doubly important that they secure recognition for the contribution 
they are making to society in other ways. 

There is one additional hazard that teachers may face and that 
is the psychological problem which arises from a feeling of insecurity. 
Lack of security is not only associated with salary and tenure; it 
stems also from the feelings of the individual that he is inadequate 
to do the job that he was hired to do. Security is a feeling that 
comes to the individual who knows that he is adequately prepared, 
confident that he knows what is expected of him, and sure that he 
can handle the emergencies that might come his way. Security is 
something that must be learned, and there are teaching situations
where this learning is thwarted and where security gives way to 
insecurity. 

While the problems posed above are formidable, they are not in- 
surmountable. Just as the methods of evaluation assist the teacher 
to help the student, so, too, do they assist the teacher to help him- 
self. While it is not possible to prepare a list of suggestions for the 
solution of all personal problems of the teacher, the following suggestions
are made to facilitate self-analysis:

1. Know yourself. In many of the preceding chapters a great deal 
of emphasis has been placed upon discovering the strengths and
weaknesses of students. The point is made over and over again that 
the better we know students the better able we are to help them. The 
same generalization applies to teachers facing the psychological haz- 
ards of teaching. There isn't anything seriously wrong with the fact 
that all teachers do not have the same personality characteristics; if
they did, schools might be a very dull place. What is important is 
that the teacher know himself—that he recognize what he can do 
and what he cannot do; that he utilize his positive characteristics 
to their maximum and work to overcome any limitations of knowl- 
edge, skill, and attitude that are evident. 

2. Know what you are trying to accomplish. Much of the frustra- 
tion associated with teaching can be eliminated if the teacher will 
take the necessary steps to determine where he wants to go. In 
Chapters 3 and 4 the discussion associated with the develop- 
ment of objectives was designed to help the teacher determine his 
goals. The objectives of education should form the guideposts for 
both students and teachers, providing a sense of direction which is 
so important to the development of security. 

3. Know what you have accomplished. It is not only comfortable
to know where you want to go; it is very reassuring to know how 
far you have progressed. The theme of this entire book has been that 
it is possible to measure the progress of students toward the ob- 
jectives of education. While it is recognized that at present not all 
of the objectives of education can be as adequately measured as we 
would like, the fact remains that the teacher can at the present time 
secure many valid measurements of student progress. Knowing what 
we have accomplished is one of the best ways of developing a feeling
of adequacy and security. Should we not receive recognition
for our efforts from other sources, the knowledge that we have
succeeded in helping our students learn represents an important 
form of recognition. 

4. Know the limitations of the teaching situation. The best of 
teachers cannot make brilliant scientists out of dullards or change 
completely the values that have been inculcated by family, friends, 
and community. What can be done is to discover as much as possible
about students and the environments in which they live, and
then make realistic decisions about what can and what cannot be
done. If we set the level of aspiration too high for the students and
they are not capable of reaching it, then not only we but also
the students are frustrated. It is important to recognize, however,
the necessity of setting goals high enough so that we challenge the
students rather than permit them to “coast” along. The limitations
in the teaching situation are not all confined to the students. Many 
limitations exist because of lack of facilities, poverty of the com- 
munity, rigidity of curriculum, and administrative dictum. Each of 
these conditions must be taken into account when the teacher at- 
tempts to appraise what has happened in the classroom. A realistic 
appraisal results in a reduction of the pressures that create psycho- 
logical hazards for teachers. 

5. Know how you can improve yourself. In the earlier sections of 
this chapter the relationship of evaluation to the improvement of 
instruction and the modification of the curriculum was considered.
The generalizations made there are also pertinent to the present discussion.
To avoid the hazards associated with teaching, it is suggested
that the individual develop self-awareness, deliberately seek
new experiences, look for help on specific questions, develop supple- 
mentary areas, and recognize new possibilities in teaching. 

There are unquestionably many psychological hazards in teach- 
ing. However, it is probably correct to surmise that many of these 
hazards can be minimized if efforts are made to do so. The mental 
health of the teacher is of the utmost importance, not only to the 
teacher himself, but to the students and society as well. The respon- 
sibility for maintaining optimum mental health cannot be delegated. 
One of the best means by which this health can be maintained is by 
engaging in a continuous program of self-evaluation. 




Selected References


ADAMS, GEORGIA S., and TORGERSON, T. L. Measurement and Evaluation
for the Secondary-School Teacher. New York: Dryden Press,
1956.

ADKINS, DOROTHY C. Construction and Analysis of Achievement
Tests. Washington, D.C.: U.S. Government Printing Office, 1947.

ANASTASI, ANNE. Psychological Testing. New York: Macmillan Co.,
1954.

BEAN, K. L. Construction of Educational and Personnel Tests. New
York: McGraw-Hill, 1953.

BUROS, OSCAR K. (ed.). The Fourth Mental Measurements Yearbook.
Highland Park, N.J.: Gryphon Press, 1953.

CRONBACH, LEE J. Essentials of Psychological Testing. New York:
Harper and Brothers, 1949.

FROEHLICH, CLIFFORD P., and DARLEY, JOHN G. Studying Students.
Chicago: Science Research Associates, 1952.

GREENE, H. A.; JORGENSEN, A. N.; and GERBERICH, J. R. Measurement
and Evaluation in the Elementary School (2nd ed.). New
York: Longmans, Green and Co., 1953.

GREENE, H. A.; JORGENSEN, A. N.; and GERBERICH, J. R. Measurement
and Evaluation in the Secondary School (2nd ed.). New
York: Longmans, Green and Co., 1954.

GULLIKSEN, HAROLD. Theory of Mental Tests. New York: John
Wiley and Sons, 1950.

HENRY, NELSON B. (ed.). Measurement of Understanding. (Forty-fifth
Yearbook, Part I, National Society for the Study of Education.)
Chicago: The University of Chicago Press, 1946.

HUMPHREYS, J. ANTHONY, and TRAXLER, ARTHUR E. Guidance Services.
Chicago: Science Research Associates, 1954.




JORDAN, A. M. Measurement in Education. New York: McGraw-Hill,
1953.

LINDQUIST, E. F. (ed.). Educational Measurement. Washington,
D.C.: American Council on Education, 1951.

MICHEELS, WILLIAM J., and KARNES, M. RAY. Measuring Educational
Achievement. New York: McGraw-Hill, 1950.

MONROE, WALTER S. (ed.). Encyclopedia of Educational Research
(rev. ed.). New York: Macmillan Co., 1950.

ODELL, C. W. How to Improve Classroom Testing. Dubuque, Ia.:
William C. Brown, 1953.

REMMERS, H. H., and GAGE, N. L. Educational Measurement and
Evaluation (rev. ed.). New York: Harper and Brothers, 1955.

ROSS, C. C., and STANLEY, JULIAN C. Measurement in Today's
Schools (3rd ed.). New York: Prentice-Hall, Inc., 1954.

THOMAS, R. MURRAY. Judging Student Progress. New York: Longmans,
Green and Co., 1954.

THORNDIKE, ROBERT L., and HAGEN, ELIZABETH. Measurement and
Evaluation in Psychology and Education. New York: John Wiley
and Sons, 1955.

TORGERSON, T. L., and ADAMS, GEORGIA S. Measurement and Evaluation
for the Elementary-School Teacher. New York: Dryden
Press, 1954.

TRAVERS, ROBERT M. W. Educational Measurement. New York:
Macmillan Co., 1955.

TRAVERS, ROBERT M. W. How to Make Achievement Tests. New
York: Odyssey Press, 1950.

TRAXLER, ARTHUR E.; JACOBS, ROBERT; SELOVER, MARGARET; and
TOWNSEND, AGATHA. Introduction to Testing and the Use of Test
Results in the Public Schools. New York: Harper and Brothers,
1953.

TRAXLER, ARTHUR E. Techniques of Guidance. New York: Harper
and Brothers, 1945.

WEITZMAN, ELLIS, and McNAMARA, WALTER J. Constructing Classroom
Examinations: A Guide for Teachers. Chicago: Science Research
Associates, 1949.

WRIGHTSTONE, J. WAYNE; JUSTMAN, JOSEPH; and ROBBINS, IRVING.
Evaluation in Modern Education. New York: American Book Co.,
1956.


Index


academic aptitude. See scholastic aptitude.
accrediting agencies, 26
Achievement Examinations for Secondary
Schools, 307
Achievement Examinations for Secondary
Schools (Languages), 309
achievement tests, 267-68, 271-77,
302-9. See standardized tests and
teacher-made tests.
diagnostic, 304
informal, 109-39, 140-51
selection, 305-6
survey tests, 303-4
test batteries, 304-6
Adams, Georgia, 263
Adjustment Inventory, 317
Adkins, Dorothy, 122
administration, of testing program, 98-108
Allport, Gordon, 33
American Council Psychological Examination,
275, 288
American Council Solid Geometry Test,
308
American Council Trigonometry Test,
308
Anderson Chemistry Test, 308
anecdotal records, 88-89, 197-206
advantages and values of, 204-6
characteristics of, 197-99
limitations of, 203-4
record forms, 200-202
writing suggestions, 197-203
answer sheets, 100-101
appreciation, 50, 62-73 
aptitude, 276, 291-302 
achievement tests as measures of, 
302-9 
definition of, 292-93 
scholastic aptitude, 279-91 
Armed Forces Qualification Test, 3 
Army Alpha, 7, 288-89 
Army General Classification Test, 3 
Arthur Point Scale of Performance, 
288 
A-S Reaction Study, 317 
attitudes, description of, 50, 62-73 
autobiography techniques, 227-31 
average. See central tendency. 


Barrett, Ryan, Schrammel English Test, 
308 

Bean, Kenneth L., 122, 152 

behavior determinants, 17-30, 345-48 

Behavior Inventory, 173-79 

behaviors defined for evaluative purposes,
21. See objectives.

Bell, Howard M., 32 

Bernard, Harold W., 227, 240 

Beta Examination, 288 

Bettelheim, Bruno, 32 

Binet, Alfred, 6 

Brady, Elizabeth Hall, 230 

Brainard Occupational Preference Test, 
171 

Brueckner, Leo J., 213 

Buros, Oscar K., 263 




California Achievement Test, 275, 306, 
355-57 
California Reading Test, 275 
California Short-Form Test of Mental 
Maturity, 274, 281, 288 
California Test of Personality, 276,
315-16
California Tests in Social and Related
Sciences, 309
case conference, 258-60
case study, 242-60
importance of, 243-44
limitations, 250-51
sample study, 252-57
steps in developing, 244-46
central tendency, 323-29
mean, 323-24, 328
median, 323-24, 329
mode, 325
checklists, 156-62
class size interval in frequency distributions,
325-27
Cleeton Vocational Interest Inventory,
171
collecting test items, 96-98


Columbia Research Bureau French 
Test, 309 

Columbia Research Bureau Spanish 
Test, 309 


comparable forms method of computing 
reliability, 83-84 
concurrent validity, 76, 78-79 
construct validity, 76, 80-81 
content of classroom programs, 56- 
59 
American history, 57-58 
art, 66-67 
biology, 72-73 
core, 68-71 
social dancing (physical education), 
62-65 
content validity, 76-78 
Cook, Walter W., 349-50 
Cooperative General Achievement
Tests, 306
Cooperative School and College Ability
Tests, 288, 352-55
correlation, 79-82, 84-86, 340-44
cost of evaluation program, 90-91
Crary American History Test, 309


criteria for evaluating measurement 
techniques, 74-93 

efficiency, 87-91 
objectivity, 86-87
reliability, 81-86 
usefulness, 91-93 
validity, 75-81 

critical thinking, 21, 50-53, 62-73 

Cronbach, Lee J., 268

Cummings World History Test, 309 

cumulative records, 376-81 

curriculum improvement, 413-26 


Darley, John G., 169 

Davis Test of Functional Competence 
in Mathematics, 308 

democracy and the evaluative process, 
13-15 

descriptive rating scales, 163-64 

Detroit Reading Test, 275 

diagnosis from measurement, 345-72 

diagnostic tests, 348-63 

Diagnostic Tests of Achievement in 
Music, 309 

Diederich, Paul, 363 

Differential Aptitude Tests, 275-76, 301 

Dressel, Paul L., 53 

Durrell-Sullivan Reading Capacity and 
Achievement Test, 235 


Ebel, Robert, 153 
economy of effort in testing program, 
89-90 
education, continuous nature of, 41-42 
educational diagnosis, 345-72 
educational factors influencing behavior, 
19-20 
efficiency in use of techniques of meas- 
urement, 87-91 
Eight-Year Study, 8, 21, 163 
Engineering and Physical Science Apti- 
tude Test, 276 
environmental factors influencing be- 
havior, 19-20 
Erickson, Clifford, 250-51 
error in measurement, 344, 353-55 
essay tests, 140-51 
criticisms of, 141-51 
grading suggestions, 149-51 
suggestions for writing, 147-49 




essay tests (cont.) 
when to use, 146-47 
Essential High School Test Battery,
276, 307
evaluation:
administrative uses of, 11, 24
Board of Education and, 25
Civil Service and, 4
current trends in, 8-10
definitions, 1-2, 17
guidance and, 11, 373-87
history of, 6-10
human values and, 13-15
military services and, 3
nonschool activities and, 22-24
process of, 26-30
program, cost of, 90-91
psychology of learning and, 15-16
teacher use of, 10, 22, 24, 413-26 
techniques of, 10-11 
Evaluation and Adjustment Series for 
Secondary Schools, 307 
evaluative process, nature of, 17-18 
expectancy chart, 361-62 
factors influencing behavior, 19-20, 
345-48 
Fenton, Jane, 259 
Flanagan Aptitude Classification Tests, 
276, 295-300 
follow-up studies, 25-26 
Frank, Lawrence K., 239 
free-response tests, 111-12 
frequency distributions, 325-29 
Froehlich, Clifford P., 169


Gage, N. L., 292 
Galton, Francis, 6 
Gates Basic Reading Test, 275 
General Aptitude Test Battery, 301 
Gerberich, J. R., 292
Goodenough Draw-A-Man Scale, 288 
grading, 321, 391-95, 404 
graphic rating scales, 165 
Greene, H. A., 292 
group relationships, 50 
guidance, 11, 373-87 
concepts of, 373-75 
cumulative records, 376-81 
student and, 381-83 


guidance (cont.) 
teacher and, 383-87 

Harris, Fred, 392-93 

Harrison-Stroud Reading Readiness, 274 

Havighurst, Robert J., 33, 385

health factors influencing behavior, 19

Henmon-Nelson Tests of Mental Ability,
274

Hildreth, Gertrude H., 263

Holzinger-Crowder Uni-factor Tests,
275

Humphreys, J. Anthony, 171


“Imperative Needs of Youth,” 35
improvement of teaching-learning situ- 
ation, 413-26 
instructional improvement, 416-18 
intelligence quotients, 279-83, 286 
intelligence testing. See scholastic apti- 
tude. 
interest inventories, 276 
interests, 50-51, 62-73, 310-14 
interviews, 206-11 
advantages and limitations, 210-11 
suggestions for improving, 209-10 
types of, 206-8 
values of, 208 
inventories, 169-81 
Iowa Every-Pupil Test of Basic Skills, 
273, 276 
Iowa High School Content Examina- 
tion, 307 
Iowa Silent Reading Test, 275 
Iowa Tests of Educational Develop- 
ment, 273, 276, 307, 358-59 


Jennings, Helen, 214, 224 
Jorgensen, A. N., 292 
Justman, Joseph, 292 


Karnes, M. Ray, 123 

Kelley-Greene Reading Test, 275 

knowledge, description of, 49, 54-55, 
62-73 

Kornhauser, A. W., 287 

Kuder Preference Record, 89, 171, 276, 
311-12 

Kuhlmann-Anderson Intelligence Test, 
274, 288 

Kuhlmann-Finch Intelligence Test, 274 




Lankton First Year Algebra Test, 308 

learning, principles of, 15 

Lee-Clark Arithmetic Fundamentals
Survey Test, 308

Lee-Clark Reading Readiness Test, 274

Lee-Thorpe Interest Test, 171

Lewin, Kurt, 32

limitations of teacher-made tests, 110-11

Lorge-Thorndike Intelligence Tests, 208


machine scoring, 89-90, 101, 104-7 
matching test items, 137-38 
Mayhew, Lewis B., 53 
mean, 323-24, 328 
meaning of a score, 321 
measurement:
efficiency in use of techniques, 87-91 
meaning of, 1-2 
Mechanical Comprehension Test, 302 
median, 323-24, 329 
mental ability tests. See scholastic apti- 
tude. 
mental age, 280-83 
Mental Health Analysis, 276 
Metropolitan Achievement Tests, 275 
Metropolitan Readiness Test, 274 
Metropolitan Reading Test, 275 
Micheels, William J., 123 
Midwest High School Achievement Ex- 
aminations, 308 
Minnesota Clerical Test, 301-2 
mode, 325 
Mooney Problem Check List, 160-61, 
170, 276, 318 
Mosier, Charles L., 122 
motivation, 419-22 
Multiple Aptitude Test, 339-40, 360-61 
multiple-choice test items, 122-36 
advantages and limitations, 129-30 
suggestions for writing, 130-36 
versatility of items, 122-29 
Myers, M. C., 122 
Myers-Ruch High School Progress Test, 
307 


Nelson Biology Test, 361 

Nelson High School English Test, 307 
Nelson-Denny Reading Test, 275 
Nixon, John E., 187 




normal distribution, 333-38 
norms, 337-40 
numerical rating scales, 164 
objective test items, 109-39 
constructing matching test items,
137-38
constructing multiple-choice test
items, 122-36
constructing true-false test items,
113-21


objectives, 32-47 
general statements, 34-36 
problems in analysis, 41-47 
relationship of general to specific, 
38-40 
relationship of learning experiences 
to, 43-44 
sources of, 32-34 
two-dimensional grid, 60-73, 94-96 
value of defining, 44-47 
objectivity, 86-87, 111 
observation, 189-97 
advantages and limitations, 193-94 
group climate, 192-93 
suggestions for improvement, 194-97 
Occupational Interest Inventory, 276, 
310-11
Ohio State University Psychological 
Test, 288 
Otis Quick-Scoring Mental Abilities 
Test, 274, 288 


parent-teacher conferences, 406-7 
Paterson, Gerald, 7 

percentile norms, 91-92 

percentile scores, 322 

permissive discussion, 239 
personal data blank, 231-38 


personality and social adjustment,
314-19

personality and adjustment inventories, 
171-73, 276 


physical factors influencing behavior, 19 

Pintner, Rudolph, 7 

Pintner-Paterson Scale of Performance
Tests, 288 

power tests, 267 

predictive validity, 76, 79-80 

Price, Helen G., 122 




Profile Index, 363-71 

profiles, 280-81, 352-63 

projective techniques, 239-41, 315, 318-19

psychological factors influencing be- 
havior, 19-20 

publishers of standardized tests, 277,
319


quartile deviation, 330-32 
questionnaires, 181-88 


range of scores, 322-23, 330 
ranking of scores, 321-22 
rating scales, 163-69 
readiness tests, 274 
Reading Comprehension Test, 275 
reliability, 81-87 
of self-report techniques, 230-31 
Remmers, H. H., 292
report cards, 389-403, 404, 412
report forms, 390, 396-402
reporting practices, 388-412
improvement by teachers, 404-12
parent-teacher conferences, 406-7
purposes of, 388-89 
self-evaluation, 407-11 
reproducing examinations, 99-100 
Rivlin, Harry N., 187, 247-50 
Robbins, Irving, 292 
Robinson, John T., 230 
role-playing, 223-26 
Rorschach Test, 5, 240, 318 


scattergram, 341 
scholastic aptitude, 279-91 
group tests, 288 
group vs. individual tests, 283-84 
individual tests, 288 
misinterpretation of results, 287 
performance vs. verbal tests, 285-87 
scores; range of, 322-23, 330 
ranking of, 321-22 
statistical interpretation of, 320-44 
scoring examinations, 104-7 
self-evaluation, 16, 106-7, 162, 172-73, 
227-31, 407-11 
sentence technique, incomplete, 229-30 
short-answer tests, 151-54 
skills, 49, 52-56, 62-73 




Smith, Eugene, 164 
sociodrama, 223-26 
sociograms, 214-22 
sociometry, 214-22
speed tests, 267 
split-halves reliability, 84 
SRA Achievement Series, 276 
SRA Personal Audit, 276 
SRA Primary Mental Abilities, 274, 
284, 288 
SRA Youth Problems Inventory, 170, 
180-81, 276, 318 
standard deviation, 330, 333-38 
standard scores, 336-38 
standardized tests, 261-319 
achievement, 267-68, 271-77, 302-9 
aptitude, 276, 291-302 
characteristics of, 262-63 
critical study of, 350-51 
general uses, 92, 268-70 
group tests, 265 
individual tests, 265-66 
interest, 276, 310-14 
performance, 266 
personality and social adjustment, 
276, 314-19 
publishers, 277, 319 
readiness, 274 
scholastic aptitude, 274, 279-91 
testing program, 271-77 
types, 263-68 
Stanford Achievement Tests, 275 
Stanford Reading Test, 275 
Stanford-Binet, 79, 266, 268 
stanine scores, 300 
statistical interpretation of scores, 320- 
44 
central tendency, 323-29 
correlation, 79-82, 84-86, 340-44 
error in measurement, 344, 353-55 
frequency distribution, 325-29 
mean, 323-24, 328 
median, 323-24, 329 
mode, 325 
normal distribution, 333-38 
norms, 337-40 
percentiles, 91-92, 322 
profiles, 280-81, 352-63 
quartile deviation, 330-32 
range of scores, 322-23, 330




statistical interpretation of scores (cont.) 
ranking scores, 321-22 
standard deviation, 330, 333-38 
standard scores, 336-38

Strang, Ruth, 403 

Strong Vocational Interest Blank, 171,
276, 312-13
student behaviors, describing, 48-56 
student-teacher planning, 59 


Taba, Hilda, 32, 230 
Taxonomy of Educational Objectives, 
54-56 
teacher-made tests, 109-39 
essay tests, 140-51 
matching test items, 137-38 
multiple-choice test items, 122-36 
purposes of, 109-10 
short-answer, 151-54 
true-false test items, 113-21 
teachers, psychological security of, 
422-26 
teaching-learning situation, improvement
of, 413-26
Terman, Lewis, 7
Terman Group Tests of Mental Ability,
275
Terman-McNemar Test of Mental Ability,
288
test batteries, 348-63
test construction, steps in, 112-13
test directions, 101-3
test item file, 97-98, 107
test length, 103-4
test publishers, 277, 319
test score interpretation, 320-44, 353-63
testing program, 271-77
administration of, 98-108
economy of effort in, 89-90
test-retest reliability, 82-83
Thematic Apperception Test, 5, 240-41,
266, 318


Thorndike, Edward L., 7
Thurstone Temperament Schedule, 276 
Thurstone Test of Mental Alertness, 274 
Torgerson, Theodore L., 173, 263 
Travers, Robert, 141 
Traxler, Arthur E., 171, 204, 242, 271,
278, 292, 391-92
Triggs Diagnostic Reading Test, 275
true-false test item, 113-21 
advantages, 114 
limitations, 114-15 
suggestions for writing, 115-19 
variations, 119-21
T-score, 336-38 
two-dimensional grid, 60-73, 94-96 
Tyler, Ralph, 21, 33, 60, 164 


University of Chicago Laboratory 
School report forms, 396-400 


validity, 75-81 

of self-report techniques, 230-31 
variability of scores, 330-37 
Vickery, William E., 230 
vocational interest inventories, 171 
vocational preferences, 310-14 


Warren, H. C., 292 

Warters, Jane, 258 

Watson-Glaser Critical Thinking Ap- 
praisal, 309 

Wechsler Adult Intelligence Scale, 288 

White House Conference on Education, 
36 

Wrightstone, J. Wayne, 292 

Wundt, Wilhelm, 6 


Yale Educational Aptitude Test Bat- 
tery, 301 


Zachary, Caroline B., 32 
z-score, 336-38 


